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SUMMARIZING OPEN DOCUMENT AND RECORDING MEDIUM RECORDING ITS 



2000-011003 [JP 2000011003 A] 
January 14, 2000 (20000114) 
INAGAKI HIROTO 
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ABSTRACT 

PROBLEM TO BE SOLVED: To correctly and automatically summarize a summary 
object document by making the summary object document and a similarity 
summary object document a summary object document group, collecting 
sentences which are judged to be similar in the group and expressing the 
object document to be summarized in a proper expression and proper order. 

SOLUTION: A summary object document group morpheme analyzing part 6 
collects a summary object document and a similarity summary object 
document and makes them a summary object document group, divides 
words, classifies them into parts of speech and analyzes 
morphemes. A summary object document group event analyzing part 7 
analyzes events described in the document based on the morpheme 
analysis result of the group. A similarity meaning judging part 8 judges 
the sentences having the same event analysis result to be similar 
in meaning based on the event analysis result and extracts the 
similar sentences from all the sentences included in the group. A 
document summary output part 9 collects the sentences which are judged 
to be similar in the group and outputs them in a proper expression and 
proper order. 
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ABSTRACT 

PROBLEM TO BE SOLVED: To facilitate an operation and a processing when 
contents of summary in a summary table are classified into desired summary 
items and summed up. 

SOLUTION: A classification and summary processing of the summary items in a 
database by executing a data analyzing processing regarding the summary 
table, reading a specified database record to be processed from the 
database stored in a storage device 7, setting hierarchical structure to 
relate the summary items with one another by grouping each summary 
item set in the database record by every analyzing key item, 
setting an index key item and a name of a data item to be analyzed in a 
recording form of the database for analysis to be explained later, 
simultaneously setting the analyzing key item and calculation item by 
every group in the recording form of the analyzing key and the 
calculation item to be explained later, displaying an analyzing browser 
screen in which selection keys for classification and summary in a line 
direction and a column direction based on the analyzing key item and 
simplifying a narrowing operation of the summary items simply by operating 
the selection keys for classification and summary by a CPU 2. 

COPYRIGHT: (C)1999,JPO 
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ABSTRACT 

PROBLEM TO BE SOLVED: To provide a document processor and a document 
preparation method capable of judging how similar plural documents 
are by a document unit, gathering the documents of a high similarity 
degree, preparing a summary for respective document groups and 
preparing an easily readable summary. 

SOLUTION: Document vectors for the respective plural documents to be 
[illegible] are obtained and the difference of the document vectors is taken 
between the respective documents. The identity of a topic is judged 
depending on whether a cosine value between the two successive documents is 
high or low. The documents defined as belonging to the same topic (that is, 
the case that the similarity degree is high) are gathered in the time order 
of writing and applied to a summary extraction algorithm. This is repeated for 
the documents defined as belonging to the respective topics, the respective 
partial summaries are bound and the whole summary is generated. 
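
The topic-grouping step in this abstract can be illustrated with a minimal sketch. It assumes plain bag-of-words term counts as the document vectors and a fixed cosine cutoff; the names `doc_vector`, `cosine`, `group_by_topic` and the `threshold` value are hypothetical and do not come from the patent. The summary extraction algorithm itself is not specified in the abstract and is left out.

```python
import math
from collections import Counter


def doc_vector(text):
    """Bag-of-words term counts (an assumed vector representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_by_topic(docs, threshold=0.3):
    """Split time-ordered documents into groups of the same topic.

    A new group starts whenever the cosine between two successive
    documents falls below `threshold` (a hypothetical cutoff).
    """
    groups = []
    prev = None
    for text in docs:
        vec = doc_vector(text)
        if prev is not None and cosine(prev, vec) >= threshold:
            groups[-1].append(text)   # same topic as the previous document
        else:
            groups.append([text])     # topic change: open a new group
        prev = vec
    return groups
```

Each resulting group would then be passed to whatever extraction routine is in use, and the partial summaries concatenated in group order.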
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ABSTRACT 

PROBLEM TO BE SOLVED: To enable a user to easily understand information in an 
inputted document even if the language of the input document is 
not the mother tongue and even if a large amount of input documents exists. 

SOLUTION: The summary part of the input document and a translation part 
translating the input document and the summarized result into the other 
natural language are provided. The input documents (facsimile document, 
electronic mail document, designated file and the like) are received from a 
local or remote information source (ST1). The language of the input 
document is judged and a field is judged by using an (if-then) inference 
rule and statistical information (ST2 and ST3). When summary is required 
for the input document, the summary part generates summary (ST4 and ST5). 
The summary is generated by filling the slot of a template, for example. 
When translation into the other language is required for the input 
document and the summarized result, the translation part generates 
translation (ST6 and ST7). A language judged result, a field judged 
result, the summarized result and a translated result are sent to the 
local or remote user. 
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ABSTRACT 

PROBLEM TO BE SOLVED: To obtain the image processing unit which surely 
stores image data even when various image processing such as magnification, 
decoration and edit is conducted before the compressed image data are 
stored in an image data storage means by predicting an image compression 
rate for one page after the image processing based on the image processing 
mode. 

SOLUTION: A compression rate prediction circuit 160 of an image 
processing section 11 predicts a compression of image data for one page 
stored in page memory circuits 119, 120 based on magnification and 
decoration information obtained from a controller 123 according to the 
image processing mode set in an operation section 3 and based on a 
density mean value or the like of an image with high correlation with 
the compression rate by selected density conversion circuits 129, 130 and 
gradation conversion circuits 131, 132. Whether or not the image data 
compressed based on the predicted compression rate are stored in a 
storage means is discriminated, and when unable to be stored, the input of 
image is limited. Thus, even when the data amount of the compressed image 
after image-processed is indefinite, the image data are surely stored. 
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ABSTRACT 

PURPOSE: To make it possible to automatically select information at the 
time of its reception by providing the information selecting/receiving 
system with a document filter for deciding the validity of storage in a 
receiving part in accordance with an attribute described in the document 
summary of received information. 

CONSTITUTION: A receiving condition to be compared with the attribute of 
a document summary is previously set up in the document filter 15. 
Information is temporarily received from a communication line 2 to a 
prereceiving part 14 through a communication control part 12 and whether 
the document summary of the received information conforms to the receiving 
condition or not is decided by the filter 15. At the time of conforming to 
the receiving condition, the information received by the prereceiving part 
14 is transferred to a receiving part 13, and in the case unconformable to 
the receiving condition, the information is immediately erased and thrown 
away by the prereceiving part 14 without being transferred to the receiving 
part 13. Consequently, the received information can be automatically 
selected only by previously setting up the receiving condition in the 
document filter. 
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XRPX Acc No: N00-307818 

Production of topic summary for set of documents in computer 
system, involves labeling document subset obtained by division of 
document set corresponding to retrieved predefined information, with 
topic 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) 
Inventor: BARRETT R C; COHEN A L; MAGLIO P P; SHELDON M A 
Number of Countries: 085 Number of Patents: 003 
Patent Family: 
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E? 1224 578 Al E G06F-017/30 Based on patent WO 200029985 

Designated States (Regional): AT BE CH CY DE DK ES FI FR GB GR IE IT LI 
LU MC NL PT SE 

Abstract (Basic): WO 200029985 A1 

NOVELTY - Information about accessing technique, identifier, 
accessing sequence of document is retrieved. Based on retrieved 
information, set of documents is divided into subsets. Each 
subset is labeled with a topic (32). 

USE - For producing topic summary for set of documents on 
computer system. For conveying expertise of users of computer system to 
other users. 

ADVANTAGE - Benefits broad range of users of the expertise of 
experts as expressed via expert's access and use of documents. Parses 
document browser trails or paths into sequences of documents related to 
common topic, automatically using simple technique. Makes available 
traces of expert's browsing and searching behavior, thereby facilitates 
use of distributed expertise within organization. Helps users to find 
documents that are already read with expertise in specific field. 

DESCRIPTION OF DRAWING (S) - The figure shows system for capturing 
and conveying expertise in document usage. 

Topic (32) 

pp; 35 DwgNo 1/7 

Title Terms: PRODUCE; TOPIC; SUMMARY; SET; DOCUMENT; COMPUTER; SYSTEM; 
DOCUMENT; SUBSET; OBTAIN; DIVIDE; DOCUMENT; SET; CORRESPOND; RETRIEVAL; 
PREDEFINED; INFORMATION; TOPIC 
Derwent Class: T01 

International Patent Class (Main) : G06F-017/30 
File Segment: EPI 
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Data analysis system for analyzing data file containing data records each 
containing parameters, for statistical analysis to predict customer or 
potential customer behavior e.g. credit risk 
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Patent Family: 
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Applications (No Type Date): US 96651319 A 19960522 
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: Ko Kind Lan Pg Main IPC Filing Notes 

A 42 G06F-017/30 

Abstract (Basic): US 6026397 A 

NOVELTY - The data analysis system analyses a data file containing 
a number of customer records. Each record contains a number of customer 
parameters, and the system processes the records by segmentation, 
clustering and prediction of future results. 

DETAILED DESCRIPTION - The data analysis system includes an input 
for receiving a data file and a processor having several functions 
including a segmentation function for segmenting data records into a 
number of segments based on parameters of the records. The functions 
also include a clustering function for clustering records having 
similar parameters. A prediction function predicts expected 
future results from parameters in the data records. 

An INDEPENDENT CLAIM is included for a method for analyzing a data 
file containing a number of data records. 

USE - Statistical analysis e.g. to predict customer or potential 
customer behavior e.g. propensity to respond to direct mail or 
telemarketing, product reference, profitability, credit risk and 
probability of attrition. 

ADVANTAGE - Provides for segmenting records into logical groups, 
and provides for clustering records into statistically significant 
groups. 

DESCRIPTION OF DRAWING (S) - The drawing shows a clustering analysis 
window in accordance with the data analysis invention. 
Clustering analysis window (274) 
Toolbar (276) 
Cluster map (278) 

Parameter statistics information (280) 
Open new input data configuration button (284) 
Save new input data configuration button (288) 
Select results button (292) 
pp; 42 DwgNo 13/34 
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Summary characteristics extraction system for disclosure document summary 
apparatus - has analysis unit which analyzes event described in document 
based on morphological analysis result of each document in summary 
objective document group 
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Abstract (Basic) : JP 2000011003 A 

NOVELTY - An analysis unit (7) analyzes event described in a 
document based on the morphological analysis result of each document 
in a summary objective document group. A semantic content 
judging unit (8) extracts sentences with similar event analysis 
result and collects the extracted sentences. A document summary output 
unit (9) expresses summary of the document in suitable format. 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
recording medium for storing summary characteristics extraction program 
of disclosure document summary apparatus. 

USE - For extracting summary characteristics in disclosure summary 
apparatus. 

ADVANTAGE - Since sentences with similar event analysis result are 
extracted and collected for expressing summary of document in suitable 
format, precise summary of document can be obtained automatically. 

DESCRIPTION OF DRAWING (S) - The figure shows block diagram of 
components in summary characteristics extraction system for disclosure 
document summary apparatus. (7) Summary objective document 
group event analysis unit; (8) Semantic content judging unit; (9) 
Document summary output unit. 
Dwg.1/3 
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CIP of application US 9840219 
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Provisional application US 9886410 
CIP of patent US 6263337 
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Abstract (Basic) : WO 9962007 Al 

NOVELTY - The need for further accessing of the data for further 
clustering of records in the database is determined. Based on the 
determination result, additional number of records are read from 
database memory and stored in the rapid access memory for further 
updating of cluster model. 

DETAILED DESCRIPTION - The data records having both discrete and 
ordered attributes are read from the database memory and a portion of 
read data records is stored in the rapid access memory. The cluster 
model characterizing the data within the database and including a table 
of probabilities for the enumerated or discrete data attributes of 
data records for each cluster is initialized. The cluster model 
for ordered data attributes comprises a mean and covariance for each 
cluster. The cluster model is then updated from the database records 
stored in the rapid access memory. For this updating, the table of 
discrete attribute probabilities for cluster is adjusted by 
calculating a weighted sum of the data records stored in the rapid 
access memory and the weighted sum for data records already 
summarized in the cluster model. The database records in the rapid 
access memory are then summarized and the summarized records are stored 
within the memory. INDEPENDENT CLAIMS are also included for the 
following: 

(a) data evaluation apparatus for database; 

(b) data clustering software 

USE - For data clustering in database management system used in 
business organization, companies and for statistics, pattern 
recognition, machine learning application and in science and 
engineering fields. Also in data mining applications including 
marketing, fraud detection in credit cards, banking, 
telecommunications, customer relation and churn minimization in 
airlines, telecommunication services, internet services, direct 
marketing on web and live marketing in electronic commerce. 

ADVANTAGE - Enables visualizing, summarizing, navigating and 
predicting properties of data/clusters in the database efficiently. 
The parameters enable database records to be assigned to a cluster in a 
probabilistic fashion, reliably. Since the probabilistic clustering 
enables reliable sampling and indexing, the data accessing efficiency 
is improved greatly. Enables effective and accurate clustering in one 
or less database scans. The continuous fields are discretized prior to 
applying the clustering technique, if the database contains both 
discrete and continuous fields. 

DESCRIPTION OF DRAWING (S) - The figure shows the flowchart 
explaining the clustering procedure for mixed continuous and 
discrete data. 
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Database indexing method employed for indexing and locating information 

in WWW, LAN, WAN 
Patent Assignee: DIGITAL EQUIP CORP (DIGI) 
Inventor: BURROWS M 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 
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Patent Details: 
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US 5963954 A 43 G06F-017/30 Cont of application US 96700748 
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Abstract (Basic) : US 5963954 A 



NOVELTY - Index entries having identical bucket numbers are written 
to a single index file in collating order of parsed unique words. A 
summary file is generated for each index file. The index files 
and their corresponding summary files are grouped into tiers of 
files. 

DETAILED DESCRIPTION - Batches of records in database are parsed 
into words and locations, with each word representing a portion of 
information parsed from a particular record, and the locations are 
sequentially assigned to words in parsing order. An index entry is 
generated for each unique word. Each index entry includes the unique 
word and all of the locations where the unique word occurs in the 
database. Each unique word is hashed to determine a bucket number. 
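
The parse, index and bucket steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: whitespace tokenization, an in-memory dictionary standing in for index files, and CRC-32 as the hash are all choices made here, not details taken from the patent. The function name `build_index` and the `num_buckets` parameter are hypothetical.

```python
import zlib
from collections import defaultdict


def build_index(records, num_buckets=4):
    """Return {bucket_number: [(word, [locations]), ...]}.

    Locations are assigned sequentially in parsing order across all
    records, and entries within a bucket are kept in collating order.
    """
    locations = defaultdict(list)
    position = 0
    for record in records:
        for word in record.lower().split():   # assumed tokenizer
            locations[word].append(position)
            position += 1

    buckets = defaultdict(list)
    for word, locs in locations.items():
        # CRC-32 is an assumed stand-in for the unspecified hash function.
        bucket = zlib.crc32(word.encode("utf-8")) % num_buckets
        buckets[bucket].append((word, locs))

    for entries in buckets.values():
        entries.sort()                         # collating order by word
    return dict(buckets)
```

In a file-based variant, each bucket's sorted entry list would be written to its own index file; the per-file summaries and tiers mentioned in the abstract are outside the scope of this sketch.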

USE - For indexing and locating information in world wide web 
(WWW), local area network (LAN), wide area network (WAN). 

ADVANTAGE - Enables scoring entries for large databases by indexing 
the information of database into array of files. 

DESCRIPTION OF DRAWING (S) - The figure shows block diagram of 
content attributes generated by a search engine employing the database 
indexing method. 
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Document processing apparatus for automatic production of summary to 
various books, papers and reports - produces summary of documents 
automatically for every similar document group, grouped by similar 
document group production unit 

Patent Assignee: JUST SYSTEM KK (JUST-N) 
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Patent Details: 
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JP 11045288 A 11 G06F-017/30 

Abstract (Basic): JP 11045288 A 

NOVELTY - The summary of a document is produced automatically by a 
summary production unit for every similar document group grouped 
by similar document group production unit. Similarity between the 
documents is computed by a similarity calculation unit with several 
documents of predetermined format acquired by a document acquisition 
unit. 

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for the 
following: document processing method; a processing program memory 
medium. 

USE - For automatic production of summary to various books, papers 
and reports. 

ADVANTAGE - Unifies summary of every similar group of 
documents offering convenience to read. 

DESCRIPTION OF DRAWING (S) - The figure shows block diagram of 
document processing apparatus. 
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Abstract (Basic) : WO 9903049 A 

Method consists in providing a measurement system, performing the 
measurement, writing the result to a data file on a first storage 
device and repeating to accumulate results in the data file. One or 
more summary values are generated from the measurement results; 
these are saved to a summary file on a storage device and compared 
with a predefined value. The data file is then saved to one of three 
storage devices if the compared summary value is outside an 
acceptable range, or when a trigger indicates that a condition is 
present. 
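
The retention decision described here can be illustrated with a short sketch. It assumes the summary values are a mean, minimum and maximum, and that the mean is the value compared against the acceptable range; the abstract does not say which statistics are used. The names `summarize` and `should_keep_raw_data` and the parameters `low`, `high` and `trigger` are hypothetical.

```python
from statistics import mean


def summarize(results):
    """Reduce accumulated measurement results to a few summary values."""
    return {"mean": mean(results), "min": min(results), "max": max(results)}


def should_keep_raw_data(summary, low, high, trigger=False):
    """Decide whether the full data file should be retained.

    The raw data is kept when the summary mean lies outside the
    acceptable range [low, high], or when an external trigger is set.
    """
    out_of_range = not (low <= summary["mean"] <= high)
    return trigger or out_of_range
```

Under this scheme only the small summary file is always written, which is consistent with the storage reduction claimed in the ADVANTAGE paragraph.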

USE - Method is for intelligent data acquisition for a measurement 
e.g. an absorption spectroscopy measurement system, particularly 
intelligent data acquisition from a system which can be used to monitor 
a semiconductor processing tool. 

ADVANTAGE - Method substantially reduces the size of the data 
storage system and the time required for review of the collected data. 
It also eliminates the need for a compatible output signal from a 
semiconductor processing tool. 
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Abstract (Basic) : WO 9847083 A 

The system includes a target data item store for storing target 
data items. A sectioning device divides the data set into sections and 
compares each section against the target data items. A calculator 
calculates a ranking value for each section. The ranking value is 
dependent on the outcome of the comparisons. A compilation element 
compiles a summary of the data set by selecting one or more 
sections according to the respective ranking values. 

The system further includes a user input for inputting target data 
items to the target item store. A key data item identifier identifies 
key data items of the data set. A distribution value calculator 
calculates a distribution value for each section dependent on the 
distribution of the key data items in the section. A ranking value 
adjuster adjusts the relevant ranking value in a manner dependent on 
the distribution value for each section. 
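
A minimal sketch of the section-ranking idea follows. It assumes that the ranking value is simply the count of words in a section that match a target item, and it omits the distribution-value adjustment, whose formula the abstract does not give. The names `rank_sections` and `compile_summary` and the parameter `max_sections` are hypothetical.

```python
def rank_sections(sections, target_items):
    """Return (ranking_value, section_index) pairs.

    The ranking value here is the number of words in the section that
    match a target data item (an assumed scoring rule).
    """
    targets = {t.lower() for t in target_items}
    ranked = []
    for index, section in enumerate(sections):
        words = section.lower().split()
        score = sum(1 for w in words if w in targets)
        ranked.append((score, index))
    return ranked


def compile_summary(sections, target_items, max_sections=1):
    """Select the highest-ranked sections, returned in document order."""
    ranked = rank_sections(sections, target_items)
    # Highest score first; ties go to the earlier section.
    best = sorted(ranked, key=lambda pair: (-pair[0], pair[1]))[:max_sections]
    chosen = sorted(index for _, index in best)
    return [sections[i] for i in chosen]
```

Returning the chosen sections in their original order keeps the compiled summary readable even when a later section outranks an earlier one.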

ADVANTAGE - Enables summarising tool to generate summary of data 
set that includes target data items specified by user for whom 
summary is generated. 
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Document database construction method - involves forming coincidence 
network, showing expression format of link between specified word group 
suggesting document theme based on computed coincidence probability 
, to be displayed 
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Abstract (Basic): JP 8314980 A 

The method involves extracting independent words from the input 
document. A coincidence table, which records the coincidence words 
corresponding to the extracted independent words and the frequency of 
coincidence, is then produced. Then, the coincidence probability 
which expresses the coincidence related strength and the expected value 
of frequency of coincidence are computed referring to the coincidence 
table. 

An independent word group which suggests the theme of the 
document is specified by comparing the expected value and 
frequency of coincidence. Then, a link is established between the 
independent words of the specified word group which suggests the 
theme. A coincidence network, which shows the expression format of the 
link between the words and is to be displayed, is formed based on the 
computed coincidence probability. 

ADVANTAGE - Enables operator to grasp document theme. Enables 
general purpose production of database and extraction of coincidence 
network, irrespective of kind of document. Eliminates necessity of 
building large scale grammar dictionary. 
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Multi-dimensional search tree database access method - involves searching 
dimension nodes to find index and summary nodes corresponding to 
dimension values and displaying summary information when correlation 
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Abstract (Basic): US 5404512 A 

The method involves providing a combination of dimension values 
identifying a set of database records. Numerous dimension nodes are 
searched to locate a dimension node corresponding to the values. 
Detail index and summary nodes are identified corresponding to the 
values. A record pointer is read from the detail index node 
identifying records in a detail table. Summary information is 
calculated from the set of detail records when the dimension node 
is a detail index node. 

When a summary node corresponds to the combination of dimension 
values, summary values are read from the summary node determined 
from detail records corresponding to the combination of dimension 
values stored in the summary node. When the combination of dimension 
values corresponds to a detail index node, summary information 
calculated from the detail records is displayed. The summary 
values from the summary node are displayed when the combination of 
values corresponds to a summary node. 

ADVANTAGE - Provides rapid summary information for large record 
sets. 
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Abstract (Basic) : WO 9322732 A 

The clustering system includes a retrieval unit for retrieving 
summary data characteristic of a number of case records. Each case 
record includes case record data representative of a known value of at 
least one of a qualitative and a quantitative case record variable. A 
comparator generates comparison signals indicating the distances 
between case records. 

The clustering system further includes a partitioning unit which 
selectively partitions the case records in accordance with the 
comparison signals to form child clusters therefrom. Characteristics 
summary data are calculated for each cluster. Memories are provided 
for storing the summary data of clusters. 

ADVANTAGE - Domain independent and capable of making use of all 
available case record data. 
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Abstract (Basic) : WO 200171530 A2 

NOVELTY - An acquisition unit in service provider node (100) 
acquires a user profile (400) reflecting each user's interests and 
probable needs, from which an extraction unit (110) extracts a group 
profile summary (450) specifying numbers of users in specified 
groups. An advertiser node (300a) transmits advertising messages, 
based on the extraction result, to the service provider node, which 
forwards the received messages to user terminals (200). 

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for 
information providing method. 

USE - Especially for providing advertising information and also for 
providing other services including commercial information services e.g. 
for flight schedules, price and availability of goods, purchase of 
tickets for particular flight itinerary selected from flight schedule 
[illegible] remote user terminals, using user profile information. 

ADVANTAGE - Advertisement can be routed to particular users or 
groups of users without disclosing users' identities to the advertiser. 

DESCRIPTION OF DRAWING (S) - The figure depicts the block diagram of 
the information system. 

Service provider node (100) 
Extraction unit (110) 
User terminals (200) 
Advertiser node (300a) 
User profile (400) 

Group profile summary (450) 
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Abstract (Basic): US 6023695 A 

NOVELTY - The method involves presenting summary table creation 
recommendations to a user, or automatically generating at least one of 
the summary tables in the summary table creation recommendations 
after generating the summary table creation recommendations based on 
collected statistics on past queries submitted to a database management 
system. 

DETAILED DESCRIPTION - The generation of the summary table creation 
recommendations includes an evaluation of both the frequency and 
execution times of the past submitted queries. The generated summary 
table creation recommendations comprise ranked past queries 
submitted over a time period. The ranking of a past query is based on 
the expression, the logarithm of f squared times the quantity cpu plus 
1, where f is the frequency with which the past query is submitted 
during the time period and cpu is the average CPU execution time for 
the past query during the time period. 
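
The verbal ranking expression is ambiguous as written. The sketch below takes one possible reading, log(f^2 * (cpu + 1)), with the final term read as the number one and a natural logarithm assumed; the alternative reading log(f^2) * (cpu + 1) would rank queries differently. The function names `query_rank` and `rank_queries` are hypothetical and are not part of the patent.

```python
import math


def query_rank(frequency, avg_cpu_time):
    """Ranking value under the assumed reading log(f^2 * (cpu + 1)).

    frequency    -- times the query was submitted in the period (f > 0)
    avg_cpu_time -- average CPU execution time of the query (cpu >= 0)
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    if avg_cpu_time < 0:
        raise ValueError("avg_cpu_time must be non-negative")
    return math.log(frequency ** 2 * (avg_cpu_time + 1))


def rank_queries(stats):
    """Order query names by descending ranking value.

    stats maps a query name to a (frequency, avg_cpu_time) pair.
    """
    return sorted(stats, key=lambda name: query_rank(*stats[name]), reverse=True)
```

Under this reading, frequency is weighted more heavily than CPU time, so a cheap query that runs very often can outrank an expensive query that runs rarely.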

INDEPENDENT CLAIMS are also included for the following: 

(a) a self-monitoring system for automatic tuning according to 
system demands; 

(b) the computer system adapting the summary table management; and 

(c) a computer-readable medium storing the program for summary 
table management. 

USE - Used in a computer system. 

ADVANTAGE - Ensures efficient execution of user queries while 
minimizing required machine resources. Automatically generates an 
appropriate SQL query, allocates memory for the summary table to be 
created, executes the generated SQL query, and populates the summary 
table with the appropriate data set. Automatically deletes the 
selected summary table from the database upon selection of a 
recommendation to delete a summary table. Is not limited to any 
specific combination of hardware circuitry and software. Creates and 
maintains the most effective summary tables. 

DESCRIPTION OF DRAWING (S) - The figure is a flowchart depicting a 
preferred methodology for creating database summary tables. 
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Abstract (Basic) : WO 200041122 A2 

NOVELTY - A method (I) of identifying a difference between at least 
2 data sets made up of ordered elements utilizing internal features 
within the data sets for calculations relating to normalization, 
scaling and difference finding, is new. 

DETAILED DESCRIPTION - A method (I) of identifying a difference 
between 2 groups, comprising: 

(1) providing a first group (G1) having 1 or more elements in a 
first data set (DS1); 

(2) applying at least 1 transformation to DS1 to provide a 
transformed data set (the transformation is either a normalizing 
calculation, an averaging calculation or a scaling calculation); and 

(3) distinguishing differences, if present, between elements of the 
transformed DS1 and a second group (G2) having 1 or more elements 
in a second data set (DS2). 

Therefore identifying a difference between the groups. 
INDEPENDENT CLAIMS are also included for the following: 

(i) a display device (II) displaying a representation of a 
difference between 2 or more transformed data sets (DSs), in which each 
DS comprises ordered elements and the DSs are transformed by at least 1 
calculation (either a normalizing calculation, an averaging calculation 
or a scaling calculation) (i.e. (II) is used for displaying results 
obtained via (I)); and 

(ii) a representation of a difference between normalized, averaged 
and scaled DSs, in which the DSs comprise ordered elements (a 
representation of results from (I)). 

USE - (I) is used for identifying differences between data sets 
made up of ordered elements, and is especially useful for handling 
data obtained from genetic analysis. It is suitable for identifying 
constant and/or unvarying components in a data set which may serve as 
reference markers in normalizing, scaling and distinguishing 
differences between data sets in such a study. In particular, it is a 
robust method for normalizing scale and finding differences in 
experiments related to the differential expression of genes in cells 
and tissues subjected to specific experimental treatments. 

ADVANTAGE - (I) counteracts or overcomes the effects of noise when 
comparing data sets (DSs) in experimental studies. Scaling landmarks 
can be identified automatically in data sets being compared to one 
another. 
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Abstract (Basic): WO 9941878 Al 

NOVELTY - A request r, a policy assertion (f0, POLICY), and n-1 
credential assertions (f1, s1), ..., (fn-1, sn-1) are received. Each credential 
assertion includes a credential function (fi) and a credential source 
(si). An acceptance record set (S) is initialized, and each 
assertion (fi, si) is run and the result is added to the acceptance 
record set (S) (i represents the integers from n-1 to 0). 

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are included for: an 
apparatus for compliance checking in a trust-management system; a trust 
management platform; a trust-management system; a medium storing 
instructions for execution by a processor. 

USE - Compliance checking in a trust-management system. 

ADVANTAGE - Provides method, solvable in polynomial time, that 
checks the compliance of a request with a policy assertion based on 
credential assertions. 

DESCRIPTION OF DRAWING(S) - The drawing is a flow diagram of a 
method for compliance checking for a trust-management system of the 
invention. 

Abstract (Basic): FR 2524740 A 

The process for compressing a digitised image uses a series of 
digital values Bij representing the brightness. Bij is characteristic 
of the point in row j situated on the ith line of the image. Three 
values are calculated for each of the values of j from 1 to N. S1: 
(aij = Bij - B'i-1,j-1) S2: (bij = Bij - B'i-1,j-1) S3: (cij = Bij - 
B'ij, j+1). 

N is the number of points in a line, and is a fixed number, 
between 0.5 and 1 chosen to optimise the compression of 
information according to the type of image processed. The B* values are 
the expanded-compressed values of the corresponding B values. The 
values are coded into blocks formed of consecutive values of B. 

ABSTRACT EP 751470 A1 

A method of automatically generating feature probabilities that allow 
later automatic generation of document extracts. The computer system 
generates the probabilities by analyzing the documents one document at a 
time. First, the computer system designates one of the documents as a 
selected document. Next, the computer system analyzes each sentence of 
the selected document to determine the value of the paragraph feature and 
the value of the uppercase feature. The computer system repeats this 
effort for each document of the document corpus. Afterward, the number of 
occurrences of each value of each feature is calculated and is used to 

...SPECIFICATION summary sentence is the product of two joined document 
sentences, only one of which will be designated as the matching sentence. 

B. Training to Generate Feature Probabilities 

Training determines feature probabilities that can be used later to 
automatically extract from a document the same set of sentences that 
an expert might select for a summary. Training requires a feature set 
and a matched training corpus. Both the preferred feature set and a 
method of matching a training corpus are described in detail above. Given 
these prerequisites, during training... 
...each of its possible values within all sentences, as well as within 
sentences matching summary sentences. Processor 11 uses these counts to 
determine two kinds of probabilities: 

1. The probability of observing a value of a feature j in a sentence 
s included in the summary S, P(Fj | s ∈ S); and 

2. The probability of feature j taking the observed value, P(Fj). 

Figure 6 illustrates in flow diagram form instructions 300 executed by 
processor 11 to determine the required probabilities from the matched 
training corpus. Instructions 300 may be stored in machine readable form 
in solid state memory 25 or on a floppy disk placed... 
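
The counting step described above can be sketched in a few lines. This is an illustrative sketch only, not instructions 300 themselves: the function name, the input layout (one dictionary of feature values per sentence plus a flag saying whether the sentence matches a summary sentence) and the feature values are assumptions. The feature names "paragraph" and "uppercase" are taken from the abstract. 

```python
from collections import Counter, defaultdict


def feature_probabilities(sentences):
    """Estimate P(Fj = v | s in S) and P(Fj = v) from a matched corpus.

    `sentences` is a list of (features, in_summary) pairs, where `features`
    maps a feature name to its observed value and `in_summary` is True when
    the sentence matches a summary sentence (hypothetical input layout).
    """
    all_counts = defaultdict(Counter)      # counts over all sentences
    summary_counts = defaultdict(Counter)  # counts over matching sentences
    n_all = len(sentences)
    n_summary = sum(1 for _, in_summary in sentences if in_summary)

    for features, in_summary in sentences:
        for name, value in features.items():
            all_counts[name][value] += 1
            if in_summary:
                summary_counts[name][value] += 1

    p_value = {name: {v: c / n_all for v, c in counts.items()}
               for name, counts in all_counts.items()}
    p_given_summary = {
        name: {v: (summary_counts[name][v] / n_summary if n_summary else 0.0)
               for v in counts}
        for name, counts in all_counts.items()}
    return p_given_summary, p_value


# Tiny hypothetical training corpus: four sentences, two in the summary.
corpus = [
    ({"paragraph": "initial", "uppercase": True}, True),
    ({"paragraph": "initial", "uppercase": False}, False),
    ({"paragraph": "medial", "uppercase": False}, True),
    ({"paragraph": "medial", "uppercase": False}, False),
]
p_given_summary, p_value = feature_probabilities(corpus)
```

In this toy corpus an uppercase sentence is twice as likely inside the summary (0.5) as in the corpus overall (0.25), which is the kind of contrast the two probabilities are meant to expose. 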


English Abstract 

A system and method for making program recommendations to users of a 
network-based video recording system utilizes expressed preferences as 
inputs to collaborative filtering and Bayesian predictive algorithms to 
rate television programs using a graphical rating system. The predictive 
algorithms are adaptive, improving in accuracy as more programs are 
rated. 

French Abstract 

L'invention concerne un systeme et un procede servant a recommander des 
programmes a des utilisateurs d'un systeme d'enregistrement video en 
reseau. Le systeme selon l'invention utilise des preferences exprimees 
comme entrees pour le filtrage cooperatif et des algorithmes predictifs 
bayesiens pour evaluer des programmes de television a l'aide d'un systeme 
d'evaluation graphique. Les algorithmes predictifs sont adaptatifs, leur 
precision s'ameliorant donc avec l'augmentation du nombre de programmes 
evalues. 

19th month from priority date 

Fulltext Availability: 
Detailed Description 

Detailed Description 

... to Lang, et al. is overwhelmingly one of exclusion, with the 
multiplicity of filter layers. However in a system, the aim of which is 
to predict items most likely to appeal to a user, and suggest items 
likely to appeal to a user, the redundant filtering of the present system 
would... 

...that it can learn and adapt to shifts 
in user preferences. It would be desirable to provide a distributed 
collaborative filtering engine that guaranteed a user's privacy by 
eliminating the necessity of correlating the user to other users 
or groups of users. It... 
SUMMARY OF THE INVENTION 



The invention provides a network-based intelligent system and method for 
predicting ratings for items of media content according to how likely 
they are to appeal to a user based on the user's own earlier ratings. 
Collaborative filtering and content-based prediction algorithms are 
integrated into a single, network-based system. System heuristics 
determine which of the provided algorithms provides the most reliable 
predictor for any single new content item. 

In a preferred embodiment of the invention, a network-based video 
recording system rates television programs according to the... 

17/5,K/34 (Item 23 from file: 349) 
DIALOG(R) File 349: PCT FULLTEXT 
(c) 2004 WIPO/Univentio. All rts. reserv. 

00770309 

SYSTEM AND METHOD FOR CAPTURING AND MANAGING INFORMATION FROM DIGITAL 
SOURCE 
SYSTEME ET PROCEDE DE COLLECTE ET DE GESTION D'INFORMATIONS A PARTIR D'UNE 
SOURCE NUMERIQUE 

Patent Applicant/Assignee: 
I HARVEST CORPORATION, 130 Shoreline Drive, Second Floor, Redwood Shores, 
CA 94065, US, US (Residence), US (Nationality) 
Inventor(s): 
WADHWANI David S, 1727 Ulloa Street, San Francisco, CA 94116, US, 
BUCHHEIM Dennis S, 570 Ashton Avenue, Palo Alto, CA 94306, US, 
BUCHHEIM Richard S, 220 Valdez Avenue, Half Moon Bay, CA 94019, US, 
RAPOSA Scott A, 111 N. Rengstorff Avenue, #76, Mountain View, CA 94043, 
US, 
MALASKY Ethan F, 1449 South Van Ness Avenue, #2, San Francisco, CA 94100, 
US, 
Legal Representative: 

LEHMANN Eileen A (et al) (agent), Fenwick & West LLP, Two Palo Alto 
Square, Palo Alto, CA 94306, US, 
Patent and Priority Information (Country, Number, Date): 
Patent: WO 200102984 A2-A3 20010111 (WO 0102984) 
Application: WO 2000US18111 20000630 (PCT/WO US0018111) 
Priority Application: US 99142237 19990702 
Designated States: AE AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CU CZ DE DK 
EE ES FI GB GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT 
LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT 
UA UG UZ VN YU ZA ZW 
(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE 
(OA) BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG 
(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZW 
(EA) AM AZ BY KG KZ MD RU TJ TM 
Main International Patent Class: G06F-017/30 
Publication Language: English 
Filing Language: English 
Fulltext Availability: 
Detailed Description 
Claims 
Fulltext Word Count: 11513 
English Abstract 

A system and method is provided for allowing a user to capture in 
addition to other items, items of granular information, meaning the 
subcomponents of information from a document or encompassing file which 
are of concern to the user. In addition, a Context Database is created 
for the user which includes the captured items, files associated with the 
items, and meta-data that includes keywords associated with the item. The 
Context Database is queried for the generation of a Context Summary in 
response to selections by the user or words entered by a user. The 
Context Summary is used to enhance searching and to select targeted 
advertisements for display to the user based on the user's currently 
active information. 



French Abstract 

L'invention concerne un systeme et un procede qui permettent a un 
utilisateur de collecter a partir d'un document ou d'un fichier global, 
en sus d'autres elements, des elements d'information fragmentes 
(c'est-a-dire des sous-elements d'information) interessant l'utilisateur. 
De plus, une base de donnees contextuelle est creee a l'intention de 
l'utilisateur qui comprend les elements collectes, des fichiers associes 
auxdits elements et des meta-donnees incluant des mots-cles associes aux 
elements. La base de donnees contextuelle est interrogee aux fins de 
generer un historique contextuel en reponse a des selections effectuees 
par l'utilisateur ou a des mots entres par un utilisateur. Cet historique 
contextuel est utilise pour ameliorer la recherche et pour selectionner 
des annonces publicitaires ciblees en vue de leur presentation a 
l'utilisateur sur la base des informations actives de l'utilisateur. 

Detailed Description 

request to the Context Database Manager 330 to search the Context 
Database to find occurrences of the words in the meta-data and to 
retrieve items or collections associated with the words. If any items 
or collections are associated with the words in the search string, the 
Search Enhancer requests a Context Summary for each item or 
collection from the Context Summarizer. 

The Search Enhancer creates a Search Summary from the Context Summaries 
which is sent along with the user's selected search ... searched 826 for 
occurrences of the user's search words. If no occurrences have been 
found, the method terminates 827 in this embodiment. Otherwise, the 
items and collections having the user's search words 830. For each 
item or collection, a Context Summary is created 832. Weights of 
keywords that appear more than once in all of the Context Summaries are 
summed 834. 

Whether the user clicked on an item or collection, or entered keywords, 
a Search Summary is generated based upon the results of the Context 
Summarizing by selecting 836 up to a maximum number of keywords, N, those 
having the highest... 
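
The summing and selection steps (834, 836) can be sketched as below. The excerpt is cut off after "having the highest", so ranking by summed weight is an assumption here, as are the function name, the dictionary layout of a Context Summary and the alphabetical tie-break. 

```python
from collections import Counter


def build_search_summary(context_summaries, max_keywords):
    """Sum keyword weights across Context Summaries and keep the top N.

    Each Context Summary is modelled as a dict mapping keyword -> weight
    (hypothetical layout). Ranking by summed weight is an assumption,
    because the source text is truncated at this point.
    """
    totals = Counter()
    for summary in context_summaries:
        for keyword, weight in summary.items():
            totals[keyword] += weight  # step 834: sum repeated keywords
    # Step 836: highest totals first, ties broken alphabetically.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ranked[:max_keywords]]


# Hypothetical Context Summaries for two captured items.
summaries = [
    {"patent": 2.0, "search": 1.0},
    {"patent": 1.5, "summary": 3.0},
]
top_keywords = build_search_summary(summaries, max_keywords=2)
```

With these inputs "patent" accumulates 3.5 across the two summaries and outranks "summary" at 3.0, while "search" at 1.0 falls outside the limit of two keywords. 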

Fulltext Word Count: 151011 
English Abstract 

The present invention is provided for comparison shopping by utilizing a 
customer's profile to prioritize the features of a group of similar, 
competing products. First, a customer's profile is developed. This 
profile may be developed from many sources including customer input, 
customer buying habits, customer income level, customer searching habits, 
customer profession, customer education level, customer's purpose of the 
pending sale, customer's shopping habits, etc. Next, the customer selects 
multiple, similar items, i.e. products or services to compare. Finally, a 
comparison table is presented which prioritizes the features in 
accordance with the customer's profile. 

French Abstract 

La presente invention concerne un achat par comparaison grace a 
l'utilisation d'un profil consommateur pour etablir des priorites dans 
les caracteristiques d'un groupe de produits analogues en concurrence. 
D'abord on elabore un profil consommateur. Ce profil peut etre elabore a 
partir de plusieurs sources, y compris une entree de donnees du 
consommateur, les habitudes d'achat du consommateur, le revenu du 
consommateur, les habitudes de recherche du consommateur, la profession 
du consommateur, le niveau d'education du consommateur, les attentes du 
consommateur pour la vente en cours, les habitudes d'achat du 
consommateur, etc. Ensuite, le consommateur selectionne plusieurs 
articles analogues, c.-a-d. des produits ou des services afin de les 
comparer. Enfin, un tableau de comparaison produit etablit des priorites 
de caracteristiques en fonction du profil du consommateur. 

Detailed Description 

...indicia coding the components of the system in order to show which of 
the components has services and products that can be provided. In 
particular, referring to Figure 1G, operation 46 determines the 
organization and components of an existing network framework. A database 
is also created which includes a compilation of... 



...gender or some other criteria. In operation 47b, a sales program is 
tailored to appeal to the target market by selecting only specific 
components having products or services likely to be purchased by the 
target market. Then, in operation 47c, the products or services 
related to the chosen components are chosen to be offered for sale. 

A pictorial representation of the existing network framework and a 
plurality of components of... 

17/5,K/41 (Item 30 from file: 349) 
DIALOG(R) File 349: PCT FULLTEXT 
(c) 2004 WIPO/Univentio. All rts. reserv. 

00736837 

MULTI-DOCUMENT SUMMARIZATION SYSTEM AND METHOD 
SYSTEME ET PROCEDE DE RESUME POUR PLUSIEURS DOCUMENTS 

Patent Applicant/Assignee: 
THE TRUSTEES OF COLUMBIA UNIVERSITY IN THE CITY OF NEW YORK, 116th Street 
and Broadway, New York, NY 10027, US, US (Residence), US (Nationality), 
(For all designated states except: US) 
Patent Applicant/Inventor: 
MCKEOWN Kathleen R, 20 Prospect Road, Wayne, NJ 07470, US, US (Residence), 
US (Nationality), (Designated only for: US) 
BARZILAY Regina, 548 Riverside Drive, Apt. 4B, New York, NY 10027, US, US 
(Residence), US (Nationality), (Designated only for: US) 
Legal Representative: 

TANG Henry, Baker Botts, LLP, 30 Rockefeller Plaza, New York, NY 
10112-0228, US 

Patent and Priority Information (Country, Number, Date): 
Patent: WO 200049517 A2 20000824 (WO 0049517) 
Application: WO 2000US4118 20000218 (PCT/WO US0004118) 
Priority Application: US 99120659 19990219 
Designated States: AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK 
DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR 
LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ 
TM TR TT TZ UA UG US UZ VN YU ZA ZW 
(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE 
(OA) BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG 
(AP) GH GM KE LS MW SD SL SZ TZ UG ZW 
(EA) AM AZ BY KG KZ MD RU TJ TM 
Main International Patent Class: G06F-017/10 
International Patent Class: G06F-017/27; G06F-015/00 
Publication Language: English 
Filing Language: English 
Fulltext Availability: 

Detailed Description 

Claims 

Fulltext Word Count: 4292 
English Abstract 

A summary for a collection of related documents can be generated by 
extracting phrases from the documents which include common focus 
elements. Phrase intersection analysis is then performed on the extracted 
phrases to generate a phrase intersection table, where identical or 
equivalent phrases are identified. Temporal processing on the phrases in 
the phrase intersection table is performed to remove ambiguous time 
references and to sort the phrases in a temporal sequence. Sentence 
generation is then used to combine the phrases in the phrase intersection 
table into a coherent summary. 

French Abstract 

L'invention concerne un resume de plusieurs documents connexes, qui 
repose sur l'extraction, dans les documents, de phrases comprenant des 
elements d'interet commun. On soumet lesdites phrases a une analyse 
d'intersection de phrase pour etablir une table d'intersection de phrase, 
ce qui permet d'identifier les phrases identiques ou equivalentes. Le 
traitement temporel auquel on soumet ensuite les phrases dans cette table 
permet d'eliminer les references de temps ambigues et de trier les 
phrases selon une sequence temporelle. Enfin, une fonction de generation 
de phrase permet de combiner les phrases de ladite table en un resume 
coherent. 

Detailed Description 

... phrase divergence processing (step 130), which compares selected 
phrases for differences. Phrase divergence may indicate a critical change 
in the course of events through a set of related documents and 
is worthy of inclusion in a summary. For example, a collection 
of articles regarding a plane crash could begin with a focus on the 
passengers as "survivors" and later refer to "casualties," 
"victims," "bodies" and the like, which signify a turning point in the 
events described by the documents. WordNet can also be used... 

...that it is first reported, a time stamp can be applied to the selected 
phrases based on the earliest occurrence of the phrase in the collection 
of documents (step 405). In certain cases, phrases may include 
ambiguous temporal references, such as today, yesterday, etc. In this 
case, such ambiguous references can be replaced... 

Fulltext Word Count: 23526 
English Abstract 

A system (10) for conducting surveys of voters in multiple different 
languages and registering voters is provided over a network (20), such as 
the internet. The system includes a programmed computer system 
representing network server (12) which provides an addressable voting 
site (22) and registration site (24) on the network, and a database (15) 
storing voting information for building surveys in multiple languages and 
recording the results of the surveys, and registration information for 
building registration questionnaires and recording the results of the 
questionnaires. In response to a computer (18) of a voter connecting to a 
server (12), the network server determines the language and country of 
the voter, and dynamically constructs the survey in the voter's language in 
accordance with the voting information stored in the database. The answer 
from the voter is received and added to the database tallying the totals 
for each response answered for each question for the country of the 
voter. A summary of results of the survey is constructed and transmitted 
to the voter's computer. 

French Abstract 

Ce systeme (10) est destine a la conduite de sondages, dans plusieurs 
langues, aupres de votants, et a l'inscription de votants sur un reseau 
(20), tel que l'Internet. Ce systeme comprend un systeme informatique 
programme representant le serveur (12) du reseau, lequel constitue un 
site de vote (22) adressable ainsi qu'un site d'inscription (24) sur le 
reseau, et une base de donnees (15) conservant, d'une part, des 
informations de vote aux fins d'etablissements de sondages en plusieurs 
langues et de memorisation des resultats des sondages, et d'autre part 
des informations d'inscription aux fins de constitution de questionnaires 
d'inscription et de memorisation des resultats des questionnaires. En 
reponse a un ordinateur (18) d'un votant se connectant sur un serveur 
(12), le serveur du reseau determine la langue et le pays du votant et 
construit de maniere dynamique le sondage, dans la langue du votant, en 
fonction des informations de vote conservees dans la base de donnees. La 
reponse envoyee par le votant est recue et ajoutee a la base de donnees, 
laquelle effectue les totaux pour chaque reponse a chaque question 
concernant le pays du votant. Un resume des resultats du sondage est 
construit et transmis a l'ordinateur du votant. 

Legal Status (Type, Date, Text) 

Publication 20000810 A1 With international search report. 

Examination 20001109 Request for preliminary examination prior to end of 
19th month from priority date 

Fulltext Availability: 

Detailed Description 

where histogram 153a and percentages 153b are below question 29a and 
response set 29b, and histogram 153c and percentages 153d are below question 
29c and response set 29d. Other graphics showing summary, such as 
pie charts, may similarly be used to show the results. 

Referring back to FIG. 14, if the voter selects a comparison for a 
different country via the results form page (step 154), a results form 
page... 

English Abstract 

The invention includes an electronic processing system, a method and a 
computer readable storage device for generating a serendipity-weighted 
recommendation output set to a user based, at least in part, on a 
serendipity function. The system includes a processing system to receive 
user item preference data and community item popularity data. The 
processing system is also configured to produce an item recommendation 
set from the user item preference data, produce a set of item serendipity 
control values in response to the serendipity function and the community 
item popularity data, and combine the item recommendation set with the 
set of item serendipity control values to produce a serendipity-weighted 
and filtered recommendation output set. The method includes receiving 
item preference data and community item popularity data. The method 
further includes producing an item recommendation set from the user item 
preference data, using the processing system, and generating a set of 
item serendipity control values in response to the community item 
popularity data and a serendipity function, also using the processing 
system. The method also includes combining the item recommendation set 
and the set of item serendipity control values to produce a 
serendipity-weighted and filtered item recommendation output set, using 
the processing system. The computer readable storage device has a set of 
program instructions physically embodied thereon, executable by a 
computer, to perform a method similar to that just described. 

French Abstract 

L'invention porte sur un systeme de traitement electronique, un procede 
et une memoire pouvant etre lue par l'ordinateur, ce systeme permettant 
de generer un ensemble de sorties de recommandations ponderees par 
serendipite et adaptees a un utilisateur sur la base, au moins en partie, 
d'une fonction de serendipite. Le systeme comprend un systeme de 
traitement qui recoit des donnees de preference d'articles d'un 
utilisateur et des donnees de popularite d'articles de communaute. Le 
systeme de traitement est egalement configure pour produire un ensemble 
de recommandations d'articles a partir des donnees de preference 
d'articles d'un utilisateur, un ensemble de valeurs de commande de 
serendipite d'articles en reponse a la fonction de serendipite et aux 
donnees de popularite d'articles de communaute, et combiner l'ensemble de 
recommandations d'articles avec l'ensemble des valeurs de commande de 
serendipite d'articles pour produire un ensemble de sorties de 
recommandations filtrees et ponderees par serendipite. Le procede 
consiste a recevoir des donnees de preference d'articles d'un utilisateur 
et les donnees de popularite d'articles de communaute. Le procede 
consiste egalement a generer un ensemble de recommandations d'articles a 
partir des donnees de preference d'articles d'un utilisateur, a l'aide du 
systeme de traitement, et generer un ensemble de valeurs de commande de 
serendipite d'articles en reponse aux donnees de popularite d'articles de 
communaute et a une fonction de serendipite, egalement a l'aide du 
systeme de traitement. Le procede consiste egalement a combiner 
l'ensemble de recommandations d'articles et l'ensemble des valeurs de 
commande de serendipite d'articles pour produire un ensemble de sortie de 
recommandations d'articles filtrees et ponderees par serendipite, a 
l'aide du systeme de traitement. La memoire pouvant etre lue par 
l'ordinateur comporte un ensemble d'instructions de programme 
physiquement incorporees, pouvant etre executees par l'ordinateur, de 
facon a realiser un procede similaire au procede precite. 

Fulltext Availability: 
Detailed Description 

Detailed Description 

to receive applicable data that includes user item preference data and 
community item popularity data. The processing system is also configured 
to produce an item recommendation set from the user item preference 
data, produce a set of item serendipity control values in response to the 
serendipity function and the community item... 

...applicable data that includes user item preference data and community 
item popularity data. The method further includes producing an item 
recommendation set from the user item preference data, generating a 
set of item serendipity control values in response to the community 
item popularity data and a serendipity function, and combining the item 
recommendation set and the set of item serendipity control values 
to produce a serendipity-weighted and filtered item recommendation 
output set. 
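
The three method steps named above (recommendation set, serendipity control values, combination) can be sketched as follows. The excerpt does not give the form of the serendipity function or of the combining step, so everything concrete here is an assumption made for illustration: scores and popularity are floats between 0 and 1, the serendipity function is `1 - popularity`, the combination is a product, and the filter is a fixed threshold. 

```python
def serendipity_weighted(recommendations, popularity, serendipity_fn, threshold):
    """Combine recommendation scores with serendipity control values.

    recommendations: item -> score derived from user item preference data
    popularity:      item -> community popularity (hypothetical 0..1 scale)
    serendipity_fn:  maps popularity to a control value (assumed form)
    threshold:       minimum weighted score to keep (assumed filter rule)
    """
    # Serendipity control value for each recommended item.
    control = {item: serendipity_fn(popularity.get(item, 0.0))
               for item in recommendations}
    # Assumed combination rule: multiply score by control value.
    weighted = {item: score * control[item]
                for item, score in recommendations.items()}
    kept = {item: s for item, s in weighted.items() if s >= threshold}
    # Highest weighted score first, ties broken by item name.
    return sorted(kept.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical data: a very popular item and two less well-known ones.
scores = {"blockbuster": 1.0, "indie": 0.75, "obscure": 0.25}
community = {"blockbuster": 0.875, "indie": 0.25, "obscure": 0.0}
output_set = serendipity_weighted(scores, community, lambda p: 1.0 - p, 0.2)
output_items = [item for item, _ in output_set]
```

Under these assumptions the most popular item is discounted heavily enough to be filtered out, so the output favours items the user is less likely to have found unaided. 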

The above summary of the present invention is not intended to 
describe each illustrated embodiment or every implementation of the 
present invention. Other features of the... 

... and filtered recommendations to a user according to one embodiment of the 
present invention; 

FIG. 2 illustrates a system for generating serendipity-weighted and 
filtered recommendations to a user according to another embodiment of 
the present invention; 

FIG. 4 illustrates an example of a universe of users, including a 
customer, and... 

A SCALABLE SYSTEM FOR CLUSTERING OF LARGE DATABASES HAVING MIXED DATA 
ATTRIBUTES 
SYSTEME A ECHELLE VARIABLE PERMETTANT LE GROUPEMENT DE GRANDES BASES DE 
DONNEES A ATTRIBUTS DE DONNEES MIXTES 

Claims 

Fulltext Word Count: 14550 
English Abstract 

A scalable clustering algorithm (12) accesses a database (10) of records 
having attributes or data fields of both enumerated discrete and ordered 
values and brings a portion of the data records into a rapid access 
memory. A cluster model for the data includes a table of probabilities 
(160) for the enumerated, discrete data fields of the data records. The 
cluster model for data fields that are ordered comprises a mean and 
spread of the cluster. The cluster model is updated from the database 
records brought into the rapid access memory. Some of the database 
records in the rapid access memory are summarized and stored within the 
rapid access memory. A criterion is evaluated to determine if further data 
should be accessed from the database to further cluster data records in 
the database. Additional database records in the database are accessed 
and brought into the rapid access memory for further updating of the 
cluster model. 

French Abstract 

L'invention concerne un algorithme de groupement a echelle variable (12) 
qui permet d'acceder a une base de donnees (10) dans laquelle les 
enregistrements ont des attributs de champs de donnees dont les valeurs 
sont a la fois discretes, enumerees, et ordonnees. L'algorithme permet 
d'introduire une partie des donnees dans une memoire a acces rapide. Un 
modele de groupement pour les donnees est presente, qui comprend une 
table de probabilites (160) correspondant aux champs de donnees 
discretes, enumerees, des enregistrements de donnees. Le modele de 
groupement pour les champs de donnees ordonnees fournit une indication de 
moyenne et de variabilite pour le groupement. Le modele est actualise a 
partir des enregistrements introduits dans la memoire a acces rapide. 
Certains enregistrements introduits dans la memoire a acces rapide sont 
resumes et stockes dans ladite memoire. L'evaluation d'un critere permet 
de determiner s'il convient d'acceder a des donnees supplementaires 
depuis la base de donnees pour poursuivre le groupement d'enregistrements 
dans ladite base de donnees. Ensuite, on accede a des enregistrements 
supplementaires dans la base de donnees, afin d'introduire ces 
enregistrements dans la memoire a acces rapide et de poursuivre ainsi 
l'actualisation du modele de groupement. 

Fulltext Availability: 
Claims 

Claim 

... into a rapid access memory; 

b) initializing a cluster model that characterizes the data within the 
database wherein the cluster model includes a table of probabilities 
for the enumerated or discrete data attributes of the data records for 
each cluster of a multiple number of clusters that make up the cluster... 

...calculating a weighted sum of the data records brought into the rapid 
access memory and a weighted sum for data records already summarized in 
the cluster model. 

3 The method of claim 1 wherein the step of updating the cluster model 
includes the step of adjusting a data structure of ordered attribute mean 
and covariance values by calculating a weighted sum of the mean and 
covariance values of database records brought into the rapid access 
memory and the mean and covariance values for records already summarized 
in the cluster model. 

4 The method of claim 1 wherein the step of updating the cluster model 
includes adjusting the ordered attribute mean and spread values and the 
table of discrete attribute probabilities for a cluster by calculating a 
weighted sum of the mean and covariance values and probability values 
of database records brought into the rapid access memory and the mean 
and covariance values and probability values for records already 
summarized in the cluster model. 

5 The method of claim 1 wherein both the ordered and the discrete 
attributes are assigned a confidence interval and wherein the summarizing 
step... point is suitable for summarization 
by comparing the probability that a data point belongs to a cluster with 
a threshold probability value. 

... The method of claim 5 wherein the step of summarizing the database 
records includes the step of performing a non-scalable clustering... 
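
The weighted-sum update recited in claims 3 and 4 can be illustrated with a small sketch. It covers only the cluster mean and the table of discrete attribute probabilities. Covariance is left out, and the use of record counts as the weights is an assumption, since the excerpted claims do not state how the weights are chosen. All names are hypothetical. 

```python
def merge_cluster_summary(old_count, old_mean, old_probs,
                          new_count, new_mean, new_probs):
    """Fold newly loaded records into an existing cluster summary.

    Means are lists with one entry per ordered attribute; probability
    tables map a discrete attribute value to its probability. The weights
    are the record counts on each side (an assumption for this sketch).
    """
    total = old_count + new_count
    w_old, w_new = old_count / total, new_count / total
    # Weighted sum of the ordered-attribute means.
    mean = [w_old * a + w_new * b for a, b in zip(old_mean, new_mean)]
    # Weighted sum of the discrete-attribute probability tables.
    values = set(old_probs) | set(new_probs)
    probs = {v: w_old * old_probs.get(v, 0.0) + w_new * new_probs.get(v, 0.0)
             for v in values}
    return total, mean, probs


# Three records already summarized, one record newly brought into memory.
count, mean, probs = merge_cluster_summary(
    3, [2.0, 4.0], {"red": 1.0},
    1, [6.0, 0.0], {"blue": 1.0},
)
```

Because only the count, mean and probability table are kept per cluster, the records already summarized do not need to stay in the rapid access memory for the update to be computed. 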

Detailed Description 

Claims 

Fulltext Word Count: 37125 
English Abstract 

A system and method for electronically exchanging information related to 
telecommunication services (20) includes separating data representing the 
information to be exchanged into predefined segments corresponding to 
telecommunication services (124), associating a segment identification 
code with each segment, and grouping each segment identification code 
with corresponding data (122). The system and method also include 
concatenating the segment identification codes and associated data 
according to a predefined sequence to form an electronic transaction 
(130) and transmitting the electronic message to a telecommunications 
wholesaler or reseller (132). Preferably, the information is exchanged 
over a TCP/IP connection (136) in an interactive, transaction-based 
exchange (138). 

French Abstract 

L'invention porte sur un systeme et un procede d'echange electronique 
d'informations relatives a des services de telecommunications (20) 
consistant: a separer les donnees representant les informations a 
echanger en segments predefinis (124) correspondant aux services de 
telecommunications, a associer un code d'identification de segment a 
chacun des segments, puis a reunir chacun desdits codes avec les 
informations correspondantes (122). Le systeme et le procede consistent 
de plus a concatener lesdits codes et les donnees associees en une 
sequence predefinie pour former une transaction electronique (130), puis 
a transmettre le message electronique a un grossiste ou a un detaillant 
(132). L'echange (138) des informations se fait de preference par 
l'intermediaire d'une connexion TCP/IP (136) sous forme interactive et 
transactionnelle. 

Fulltext Availability: 

Detailed Description 
Detailed Description 

Indicator M ID 1/1 

Code to indicate whether data enclosed by this interchange envelope is 
test or production 

T/Test Data 

P/Production Data 

Refer to 003030 Data Element Dictionary for acceptable code values. 

Must Use ISA16 I15 Component Element Separator M AN 1/1 
This is a field provides... 

. . . Syntax Notes . 

Semantic Notes. 

Comments: 1 A functional group of related transaction sets, within the 
scope of X12 standards, consists of a collection of similar 
transaction sets enclosed by a functional group header and a functional 
group trailer. 

Data Element Summary 

Ref.    Data 
Des.    Element    Name    Attributes 

Must Use GS01 479 Functional Identifier Code M ID 2/2 

Code identifying a group of application related transaction sets 

CA 
Claims 

Fulltext Word Count: 5507 
English Abstract 

A method for grouping related images captured with an image capture 
device (114) includes identifying a first group, the first group 
distinguishing at least one first image of an image capture method 
defined in the image capture device; and identifying a second group, the 
second group distinguishing at least one second image of one or more 
designated image characteristics, wherein the first and second groups 
provide structured relationships among images. The first group further 
includes a natural group and the image capture method further includes a 
time lapse capture. The second group further includes programmed groups. 
A system (110) includes a digital image capture device (114) capable of 
capturing and processing digital image data, and a central processing 
unit (118) within the digital image capture device. The central 
processing unit further coordinates identification of a first group and 
identification of a second group, wherein the first and second groups 
provide structured relationships among images. 

French Abstract 

La presente invention concerne un procede de groupement d'images 
connexes, saisies au moyen d'un dispositif de saisie d'images permettant 
d'identifier un premier groupe, lequel groupe distingue au moins une 
premiere image d'un procede de saisie d'images, defini dans le dispositif 
de saisie d'images. Ledit procede consiste egalement a identifier un 
second groupe, lequel groupe distingue au moins une seconde image 
comportant une ou plusieurs caracteristiques d'images designees. Le 
premier et le second groupes fournissent des relations structurees entre 
des images. Le premier groupe comporte en outre un groupe naturel et le 
procede de saisie d'images integre une procedure de saisie de temps 
ecoule. Le second groupe comporte des groupes programmes. Un systeme 
comprend un dispositif de saisie d'images numerique, le dispositif de 
saisie d'images numerique capable de saisir et traiter les donnees 
d'images numeriques et une unite centrale integree dans le dispositif de 
saisie d'images numerique. L'unite centrale coordonne l'identification 
d'un premier groupe et l'identification d'un second groupe, le premier et 
le second groupes fournissant des relations structurees entre les images. 



Fulltext Availability: 
Detailed Description 

Detailed Description 

...are capable of performing specific types of image captures. These 
capture types include time lapse 
captures and burst captures. Time lapse captures typically refer to a 
programmed capture sequence of a particular image over a set time period, 
while bursts typically refer to a rapid sequence of image captures... 

...accessing the image data. Further, attempts to manipulate and access 
these related images as sets are difficult. 

Accordingly, a need exists for easily identifiable image groups of 
related images, including user-created groups. 

SUMMARY OF THE INVENTION 

The present invention meets these needs and provides a method and 
system for grouping related images captured with an image capture 
device. In a method aspect, the method includes identifying a first 
group, the first group distinguishing at least one first image of an 
image capture method defined in the image capture device, and 
identifying a second group, the second group distinguishing at least 
one second image of one or more designated image characteristics, 
wherein the first and second groups provide structured relationships 
among images. The first group further includes a natural group and the 
image capture method further includes a time lapse capture. The second 
group further includes programmed groups. 

In a system aspect, the system includes 
Claims 
English Abstract 

This invention relates to customized electronic identification of 
desirable objects, such as news articles, in an electronic media 
environment, and in particular to a system that automatically constructs 
both a "target profile" for each target object in the electronic media 
based, for example, on the frequency with which each word appears in an 
article relative to its overall frequency of use in all articles, as well 
as a "target profile interest summary" for each user, which target 
profile interest summary describes the user's interest level in various 
types of target objects. The system then evaluates the target profiles 
against the users' target profile interest summaries to generate a 
user-customized rank ordered listing of target objects most likely to be 
of interest to each user so that the user can select from among these 
potentially relevant target objects, which were automatically selected by 
this system from the plethora of target objects that are profiled on the 
electronic media. Users' target profile interest summaries can be used to 
efficiently organize the distribution of information in a large scale 
system consisting of many users interconnected by means of a 
communication network. Additionally, a cryptographically-based pseudonym 
proxy server is provided to ensure the privacy of a user's target profile 
interest summary, by giving the user control over the ability of third 
parties to access this summary and to identify or contact the user. 
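
A minimal sketch of the profiling and ranking idea in this abstract. It assumes a target profile is a map from each word to its frequency in the article divided by its frequency across all articles, and that a user's interest summary is a map of word weights compared by cosine similarity. The cosine measure and all function names are assumptions; the abstract does not name a specific similarity function.

```python
import math
from collections import Counter


def target_profile(article_words, corpus_counts, corpus_total):
    """Word frequency in the article relative to frequency in all articles."""
    counts = Counter(article_words)
    n = len(article_words)
    return {
        w: (c / n) / (corpus_counts[w] / corpus_total)
        for w, c in counts.items()
        if corpus_counts.get(w, 0) > 0
    }


def cosine(a, b):
    """Cosine similarity between two sparse word-weight maps."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_targets(profiles, interest_summary):
    """Return target names ordered from most to least likely to interest."""
    scored = [(cosine(p, interest_summary), name) for name, p in profiles.items()]
    return [name for _, name in sorted(scored, key=lambda t: -t[0])]
```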



French Abstract 



La presente invention concerne un systeme d'identification electronique 
personnalisee d'objets recherches, tels que des articles de presse, dans 
un environnement de supports electroniques. L'invention concerne en 
particulier un systeme qui construit, d'une part un "profil cible" pour 
chaque objet dans le support electronique, en partant, par exemple, de la 
frequence de chaque mot dans un article par rapport a sa frequence 
d'ensemble pour tous les articles, et d'autre part un "resume d'interets 
de profils cibles", concernant chaque utilisateur, et decrivant le niveau 
d'interet de l'utilisateur par rapport a differents types d'objets 
cibles. Le systeme compare ensuite les profils cibles avec les resumes 
d'interets de profils cibles des utilisateurs afin de generer une liste, 
classee selon les desiderata de l'utilisateur, et concernant les objets 
cibles les plus susceptibles de presenter de l'interet pour chacun des 
utilisateurs. Cela permet a chaque utilisateur de faire un choix parmi 
les objets cibles eventuellement interessants qui ont ete selectionnes 
automatiquement par ce systeme a partir d'une quantite plethorique 
d'objets pour lesquels il existe un profil sur le support electronique. 
Les resumes d'interets de profils cibles permettent d'organiser 
efficacement la distribution de l'information dans un systeme a grande 
echelle rassemblant un grand nombre d'utilisateurs interconnectes entre 
eux par un reseau de communication. De plus, le systeme dispose d'un 
serveur pseudonyme d'interface a vocation cryptographique assurant la non 
divulgation du resume d'interets de profils cibles d'un utilisateur, et 
donnant a l'utilisateur la possibilite d'autoriser des tiers a avoir 
acces a son resume d'interets de profils cibles et d'identifier 
l'utilisateur ou de prendre contact avec lui. 

Fulltext Availability: 
Detailed Description 

Detailed Description 

...the advertiser is willing to pay 

A further use of the capabilities of this system is to manage a user's 
investment portfolio. Instead of recommending articles to the user, the 
system recommends target objects that are investments. As illustrated 
above by the example of stock market investments, many different 
attributes can be used together to profile each investment. The user's 
past investment behavior is characterized in the user's search profile 
set or target profile interest summary, and this information is 
used to match the user with stock opportunities (target objects) 
similar in nature to past investments. The rapid profiling method 
described above may be used to determine a rough set of preferences for 
new users. Quality attributes used in this system can include 
negatively weighted attributes, such as a measurement of fluctuations in 
dividends historically paid by the investment, a quality attribute that 
would have a strongly negative weight for a conservative investor 
dependent on a regular flow of investment income. Furthermore, the user 
can set filter parameters so that the system can monitor stock prices 
and automatically take certain actions, such as placing buy or sell 
orders, or e-mailing... 
ABSTRACT EP 1338983 A2 

A document summarization apparatus or method summarizes an electronic 
document written in a natural language, and generates an appropriate 
summary depending on a user's knowledge. The document summarization 
apparatus according to the present invention includes, for example, a 
summary readability improvement unit, and a summary generation unit. In 
the document to be summarized, the summary readability improvement unit 
distinguishes user known information already known to a user, and 
information known through an access log regarded as already known to a 
user based on a document previously presented to the user when a summary 
is generated, from other information than these two types of information, 
and selects the important portions of the document to be summarized. The 
summary generation unit generates the summary of the document to be 
summarized based on the selection result of the summary readability 
improvement unit. Thus, a summary can be generated depending on the 
knowledge level of a user. 
...SPECIFICATION the process shown in FIG. 13. This process is sequentially 
performed for each of the sentences retrieved by the sentence dividing 
process. 

The dependence between document components is set for the sentences 
and phrases (subordinate sentences and phrases) which themselves have low 
readability but can be made more readable by taking another related 
sentence or phrase together into a summary. The dependence is set for 
the following document components. 



(1) A subordinate clause in a sentence 

The dependence of a subordinate clause is set on a main clause, so that 
the main clause... 
ABSTRACT EP 969388 A1 

A method for learning a user preference for a desired image, the method 
comprises the steps of using either one or more examples or 
counterexamples of a desired image for defining a user preference; 
extracting a relative preference of a user for either one or more image 
components or one or more depictive features from the examples and/or 
counterexamples of desired images; and formulating a user subjective 
definition of a desired image using the relative preferences for either 
image components or depictive features. 
...SPECIFICATION form the set of images desired by the user. The image 
clusters can be dynamically adapted based on user supplied examples of 
desired or undesired images. This process of modifying clusters is very 
time consuming for large databases. Another system called NETRA (W.Y. Ma, 
"NETRA: A Toolbox for Navigating Large Image Databases", Ph.D. 
Dissertation, UCSB, 1997) also utilizes feature similarity-based image 
clusters to generate user preference-based query response. This system 
has restricted feature-based representations and clustering schemes. The 
main drawbacks of both these systems are (i) the database... 
...deleted from the database without complete database re-clustering); (ii) 
usually, a large number of positive and negative examples need to be 
provided by the user in order for the system to determine image 
clusters that correspond to the set of desired images. 

SUMMARY OF THE INVENTION 

The present invention proposes a general framework or system for user 
preference-based query processing. This framework overcomes the 
shortcomings of the existing approaches to capture and utilize user 
preferences for image retrieval. 

An object of... 
INTERNATIONAL PATENT CLASS: G06F-017/30 

...SPECIFICATION of a "current candidate" and its word count (to be 
described). At block 11, the system is also initialized to set the 
"current candidate" and corresponding "word count" to none. 

At step 12, the system sets the summary record field name to the 
unique field name in the summary structure database starting from 
the first, and at 13 retrieves from the summary candidate database the 
summary candidate (selected candidate) also starting from the first 
having a field name matching the summary record field name that has 
just been set. For example, the first summary record field name 
might be "Category". The first summary candidate with a field name 
category might be "Financial" having the criteria keywords noted above. 

Next, the... 
ABSTRACT EP 713185 A1 

A method and system for displaying names of data files in a collection 
of data files represented by a corresponding symbol. According to one 
embodiment of the present invention, a user may display a listing of 
subroutine library files required to execute a particular subroutine. 
In such an embodiment, the user may enter the subroutine name as the 
symbol of interest and the system would display the library file 
containing that subroutine as well as those data files that contain 
subroutines called by that subroutine of interest. The present invention 
uses a transitive closure technique to traverse a data structure 
generated from a database and retrieve the data file list. The 
transitive closure technique enables the use of a compact database that 
contains only the data file names, corresponding symbol names, and 
symbols names of only data files for each data file that are directly 
related to that data file. (see image in original document) 
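
The transitive closure traversal described in this abstract can be sketched as a breadth-first walk over direct file relationships. The two dictionaries below stand in for the compact database (symbol name to defining file, and file to directly related files); their names and layout are assumptions for illustration.

```python
from collections import deque


def required_files(symbol, symbol_to_file, direct_deps):
    """List the file defining `symbol` plus every file it transitively needs.

    symbol_to_file: symbol name -> data file that contains it (hypothetical).
    direct_deps: data file -> data files directly related to it (hypothetical).
    """
    start = symbol_to_file[symbol]
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for dep in direct_deps.get(current, ()):
            if dep not in seen:  # each file is visited once, so cycles are safe
                seen.add(dep)
                order.append(dep)
                queue.append(dep)
    return order
```

Only direct relationships are stored; the full list is derived at query time, which is what keeps the database compact.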
...SPECIFICATION 25, No. 5, pp. 16-25 (1991), which are incorporated by 
reference herein. 

Therefore, a need exists for a convenient method for displaying the 
data file names of a particular data file collection represented 
by a corresponding symbol. 

Summary of the Invention 
A database dependency resolution method and system in accordance with 
the present invention enables a user to automatically obtain a list 
identifying... 
ABSTRACT EP 610760 A2 

A document detection system capable of detecting a desired document 
from a large number of documents easily and accurately in which the user 
can make a judgement concerning the appropriateness of the detection 
result quickly. In the system, those documents which contain a semantic 
structure of a detection command containing natural language expressions 
entered by a user are detected. Also, the keywords of each document can 
be extracted from the summary of each document and those documents whose 
keywords match with detection keywords specified by a user can be 
...SPECIFICATION from the first character in the summary display to the 
selected character. 

Next, at the step 4003, the obtained character position is converted 
into the summary sentence number. This conversion can be carried out by 
using a summary sentence table shown in Fig. 64, in which the 
corresponding character positions and the sentence number in the original 
document are listed for each displayed summary sentence number. 
Thus, the summary sentence number can be obtained by sequentially 
comparing the obtained character position with the character position 
ranges in this summary sentence table to find out the character position 
range containing the obtained character... 

...In a case the character in "3. System function" is selected, the 
character position is within the range of 95 to 102, so that the 
summary sentence number can be determined as "5", and the 
corresponding original document sentence number can be determined as 
"16" according to the summary sentence table of Fig. 64. 

Then, at the step 4005, the position of the obtained original document 
sentence number is determined, and set to the original document 
display pointer. Here, the position of the obtained original document 
sentence number can be determined by sequentially comparing the obtained 
original document sentence number with... 

...structure of the original document to find out the corresponding 
position. 

Finally, at the step 4006, the original document is displayed according 
to the original document display pointer set at the step 4005. 
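
The lookup in steps 4003 to 4005 can be sketched as a sequential scan of a summary sentence table. Only the row for character positions 95 to 102 (summary sentence 5, original sentence 16) comes from the text; the other rows, the tuple layout, and the function name are hypothetical.

```python
# (summary sentence no., first char, last char, original sentence no.)
SUMMARY_SENTENCE_TABLE = [
    (4, 80, 94, 12),     # hypothetical row
    (5, 95, 102, 16),    # row taken from the example in the text
    (6, 103, 130, 21),   # hypothetical row
]


def locate(char_pos, table=SUMMARY_SENTENCE_TABLE):
    """Map a selected character position to (summary no., original no.)."""
    for summary_no, first, last, original_no in table:
        if first <= char_pos <= last:
            return summary_no, original_no
    return None  # position falls outside every listed range
```

The returned original sentence number is what would be used to set the original document display pointer.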

As a concrete example, Fig. 65 shows the original document display for 
the original document corresponding to the summary shown in... 

ABSTRACT EP 265083 A1 

A multimode video merchandiser system utilizes two levels of inductive 
learning to derive rules for selecting the sequence in which images of 
products stored on a videodisc are presented on a video monitor to a 
user. The first level of inductive learning generates rules from market 
survey based, consumer profile attributes assigned to items selected by 
previous users to determine the profile of the consumer most likely to be 
using the system at any given time, and to present the items in a 
sequence most likely to appeal to such a user. The second level of 
inductive learning utilizes a set of product characteristic attributes 
assigned to items selected by the current user to determine that user's 
preferences, and to modify the sequence of presentation to display first 
those items possessing the preferred characteristics. 
...SPECIFICATION the pricing comparison tells if there are multiple models 
of an item, and whether an item is on sale. Following the price 
information, the functional comparison of the items is prepared. For 
this comparison, items with similar functions are grouped 
together and a summary of the results is printed on the screen. 
Finally, the user can see a detailed description/comparison of the 
items. Again, like items are grouped, but this time the features are 
highlighted, first as to how the products are similar and then as to how 
they differ. If the items... 
Detailed Description 

Claims 

Fulltext Word Count: 22335 
English Abstract 

The present invention relates to systems and methods for interactively 
searching a database (905) in such a manner that it is quick and easy to 
search, drill down, drill-up and drill across a data collection (905) 
presenting the user with summary information using multiple independent 
hierarchical category taxonomies (915) of the data collection (905). The 
present invention also relates to business methods associated with 
providing information to users based on the searching systems and 
methods, and the revenue stream attached thereto. The present invention 
also relates to retrieving information from a database based on content 
aggregation, management and distribution. 

French Abstract 



L'invention concerne des systemes et des procedes pour faire des 
recherches interactives dans une base de donnees (905) de maniere rapide 
et aisee en accedant aux informations en mode descendant, ascendant et 
traversant a l'interieur d'une collection de donnees (905); des 
informations agregees sont soumises a l'utilisateur grace a des 
classifications de categories hierarchiques multiples independantes (915) 
de la collection de donnees (905). La presente invention concerne aussi 
des procedes d'entreprise associes a la communication des informations 
aux utilisateurs sur la base des systemes et procedes de recherche, et un 
flux de revenus qui y est lie. L'invention concerne egalement la 
recuperation d'informations a partir d'une base de donnees sur la base de 
l'agregation, la gestion et la distribution du contenu. 

Legal Status (Type, Date, Text) 

Publication 2001101! A1 With international search report. 

Examination 20020103 Request for preliminary examination prior to end of 
19th month from priority date 

Main International Patent Class: G06F-017/60 
Fulltext Availability: 
Detailed Description 

Detailed Description 

...interactively searching a database in such a manner that it is quick 
and easy to search, drill down, drill-up and drill across a data 
collection presenting the user with summary information using 
multiple independent hierarchical category taxonomies of the data 
collection. The present invention also relates to business methods 
associated with providing information to users based on the searching 
systems and methods, and the revenue stream attached thereto. The present 
invention ... more importantly, to disregard all other irrelevant 
information. 

For example, if a user enters the search term "wheel alignment," the 
system would search all the records in the data collection that 
contained the term "wheel alignment." Rather than returning a long list 
of 1,701 search results that satisfy the user's query, the present 
invention provides the user with the categories that are associated with 
the remaining records and indicates how many records are associated 
with each category. This functionality assists the user to further 
refine his/her search and disregard the irrelevant information. 
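
The category summary described in this example can be sketched as a count of categories over the matching records. The record layout (a `text` field and a list of `categories`) and the substring match are assumptions for illustration; the patent does not specify how records are matched or stored.

```python
from collections import Counter


def categorize(records, term):
    """Return the number of matching records and a per-category count."""
    needle = term.lower()
    matches = [r for r in records if needle in r["text"].lower()]
    # A record may belong to several independent categories.
    counts = Counter(cat for r in matches for cat in r["categories"])
    return len(matches), counts
```

Instead of listing every match, the caller would show each category with its count and let the user drill down into one of them.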

These searched data collections provide users with summary 
information (categorized search results) about the data collection 
being searched. Users need not use pull-down menus or fill in any 
"required" fields to construct the parameters of their search (zip code, 
city, business category, etc... 
Detailed Description 

Claims 

Fulltext Word Count: 13671 
English Abstract 

A system and method for actively marketing products and services to a 
user of a client computer such as over a network are disclosed. A product 
information database comprising product summary files that facilitate 
determination of presence or absence of products associated with the 
client computer, a marketing rule knowledge base (214) comprising 
opportunity rule files governing marketing opportunities, and an 
opportunity detection object for determination of marketing opportunities 
are utilized to determine active marketing opportunities and may be 
downloaded to the client computer from a service provider computer 
system. The opportunity detection object may comprise a scan engine, an 
opportunity analysis engine (220), and a presentation engine which 
collectively determine and present marketing information to the client 
computer user. The scan engine compares the client computer against the 
product information database to determine the configurations of the 
client computer and to generate a client computer inventory database 
(402). The opportunity analysis engine (220) analyzes the client computer 
inventory database (402) against the marketing rule knowledge base (214) 
and generates a list of marketing opportunities (404) for the client 
computer. The presentation engine analyzes the list of marketing 
opportunities (404) and provides marketing and/or other information 
regarding marketed products to the user. 
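
The scan and analysis stages in this abstract can be sketched as two small functions. The product database layout (product name to a single signature string), the rule fields `requires`, `absent`, and `offer`, and the sample product names are all hypothetical.

```python
def scan(client_items, product_db):
    """Build the client inventory: products whose signature is on the client.

    product_db maps a product name to one signature, for example an
    executable or driver file name (hypothetical layout).
    """
    return {name for name, signature in product_db.items()
            if signature in client_items}


def analyze(inventory, rules):
    """Return offers whose rule matches the presence or absence of products."""
    opportunities = []
    for rule in rules:
        has_required = rule["requires"] <= inventory
        has_excluded = bool(rule["absent"] & inventory)
        if has_required and not has_excluded:
            opportunities.append(rule["offer"])
    return opportunities
```

A presentation stage would then take the returned list and decide what to show the user; that step is omitted here.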

French Abstract 

Cette invention a trait a un systeme et a la methode correspondante 
permettant de proceder a une commercialisation active de produits et de 
services a l'intention d'un utilisateur d'un ordinateur client, sur un 
reseau notamment. On utilise une base de donnees d'information produit 
renfermant des fichiers de sommaires de produits facilitant la 
determination de la presence ou de l'absence de produits associes a 
l'ordinateur client, une base de connaissance de regle de 
commercialisation (214) renfermant des fichiers de regle d'opportunite 
regissant les opportunites de commercialisation et un objet de detection 
d'opportunite permettant de detecter des opportunites de 
commercialisation et ce, afin de determiner des opportunites de 
commercialisation active, tous ces elements pouvant etre telecharges dans 
l'ordinateur client a partir d'un systeme informatique de prestation de 
services. L'objet de detection d'opportunite peut comporter un moteur 
d'exploration, un moteur d'analyse d'opportunite (220) et un moteur de 
presentation qui determine, collectivement, une information de 
commercialisation et la presente a l'utilisateur de l'ordinateur client. 
Le moteur d'exploration etablit une comparaison entre l'ordinateur client 
et la base de donnees d'information de produit afin de determiner les 
configurations de cet ordinateur client et de creer une base de donnees 
d'inventaire d'ordinateur client (402). Le moteur d'analyse d'opportunite 
(220) analyse la base de donnees d'inventaire d'ordinateur client (402) 
par confrontation avec la base de connaissance de regle de 
commercialisation (214) et etablit une liste d'opportunites de 
commercialisation (404) destinee a l'ordinateur client. Le moteur de 
presentation analyse la liste des opportunites de commercialisation (404) 
et adresse a l'utilisateur une information de commercialisation et/ou une 
autre information relative aux produits commercialises. 
Claim 

...each product record comprises at least one of an existing product 
identification, an existing product category, and an existing product 
property of the detected product associated with the client computer. 

15 The method for marketing to the user of the client computer according 
to claim 12, wherein said product summary file is selected from the 
group consisting of a software product summary file and a hardware 
product summary file. 

16 The method for marketing to the user of the client computer 
according to claim 12, wherein said product signature of the product 
summary file is selected from the group consisting of an 
executable-type product signature, a registry-type product signature, 
an initialization-type product signature, a driver-type product 
signature, and a command-type product signature. 

...The method for marketing to the user of the client computer... 
English Abstract 

Multiple electronic commercials (ecommercials) are automatically 
assembled for an advertising campaign based upon varying characteristics 
of the targeted prospects (30), and the prospects (30) are sent 
electronic commercials corresponding to their particular characteristics. 
The commercials are preferably transmitted (60) as executable files, some 
or all of which can be authenticated (50). Preferred characteristics 
employed to produce the various commercials include age, sex, and income, 
which may be obtained from previous electronic commercials. The multiple 
commercials can differ in one or more components, preferably their video 
or audio clips. The automatic assembling of the multiple commercials 
preferably occurs in relatively close temporal proximity to their 
transmission (60). It is especially contemplated that at least 10% of the 
commercials are transmitted (60) to at least some of the targeted 
recipients (110) within 24 hours, and more preferably within 2 hours. 



French Abstract 

L'invention concerne des messages publicitaires electroniques multiples 
(e-messages publicitaires) qui sont automatiquement rassembles pour une 
campagne publicitaire basee sur les differentes caracteristiques des 
clients cibles (30); ceux-ci (30) recoivent des messages publicitaires 
electroniques correspondants a leurs caracteristiques particulieres. Les 
messages publicitaires sont, de preference, transmis sous forme de 
fichiers executables dont tout ou partie peut etre authentifie (50). Les 
caracteristiques preferees employees pour produire les differents 
messages publicitaires comprennent l'age, le sexe et le revenu, elles 
peuvent etre obtenues grace aux messages publicitaires precedents. Les 
messages publicitaires peuvent differer par un ou plusieurs de leurs 
composants, de preference leur spot video ou audio. L'assemblage 
automatique des messages publicitaires multiples se fait, de preference, 
dans un espace de temps rapproche par rapport a la transmission de ces 
messages (60). Il a ete specialement etudie qu'au moins 10 % des messages 
publicitaires soient transmis (60) a au moins quelques clients cibles 
(110) dans les 24 heures, et de preference, dans les 2 heures. 
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Detailed Description 

. . . providers are acceptable in many ways, they are not acceptable for all 
messages . 

A common shortcoming of known ecommercials is their failure to adequately
target individuals or small groups. Utilizing the same ecommercial
for too large a group results in the commercial being relatively
receive for a significant portion of the group. 




Thus. . . 



. . . for new types of ecommercials and associated methods to overcome the 
deficiencies of known commercials and methods, particularly in regard to 
being able to target individuals and/or small groups . 

Summary of the Invention 
The present invention provides electronic commercials (ecommercials) and
related methods in which a plurality of targeted prospects are selected
for an advertising campaign, multiple commercials are automatically 
assembled for the campaign based upon varying. . . 
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English Abstract 

An effect method and apparatus for organizing and processing chunks of 
interrelated information (or "thoughts") using a digital computer is 
disclosed. The invention utilizes highly flexible, associative thought 
networks to organize and represent digitally-stored thoughts. A thought
network specifies a plurality of thoughts, as well as network
relationships among the thoughts. A graphical representation of the
thought network is displayed, including a plurality of display icons
corresponding to the thoughts, and a plurality of connecting lines
corresponding to the relationships among the thoughts. Each of the
thoughts is associated with one or more software application programs,
such as a word processing or spreadsheet utility. Users are able to 
select a current thought conveniently by interacting with the graphical 
representation, and the current thought is processed by automatically 
invoking the application program associated with the current thought in a 
transparent manner. Users can conveniently modify the thought network by 
interactively redefining the connecting lines between thoughts. In 
another aspect of the invention, attribute values are associated with the 
various thoughts of the network, and the network is searched to identify 
a subset of the thoughts having attribute values equal to a desired set 
of values. Further aspects of the invention include techniques for 
scheduling selected thoughts of the network for desired operations at 
specified times, and storing timing and usage statistics in order to 
preserve a history of the processing tasks performed on each thought. 
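
As a rough illustration of the thought network described in the abstract
above, the following Python sketch stores thoughts, the links among them,
and their attribute values, and supports the attribute search. The names
Thought and ThoughtNetwork, the application field, and the choice of
undirected links are assumptions made for this example; they do not come
from the patent text.

```python
from dataclasses import dataclass, field


@dataclass
class Thought:
    """One chunk of information; `application` names the program that opens it."""
    name: str
    application: str
    attributes: dict = field(default_factory=dict)


class ThoughtNetwork:
    """Hypothetical container: thoughts plus undirected links among them."""

    def __init__(self):
        self.thoughts = {}   # name -> Thought
        self.links = set()   # each link is a frozenset of two names

    def add(self, thought):
        self.thoughts[thought.name] = thought

    def link(self, a, b):
        # A frozenset makes (a, b) and (b, a) the same connecting line.
        self.links.add(frozenset((a, b)))

    def unlink(self, a, b):
        self.links.discard(frozenset((a, b)))

    def neighbors(self, name):
        """Names of thoughts directly connected to `name`."""
        return sorted(n for pair in self.links if name in pair
                      for n in pair if n != name)

    def search(self, **wanted):
        """Names of thoughts whose attributes equal every wanted value."""
        return sorted(
            t.name for t in self.thoughts.values()
            if all(t.attributes.get(k) == v for k, v in wanted.items())
        )
```

Redefining a connecting line then amounts to one unlink call followed by
one link call, and the attribute search is a simple equality filter over
the stored values.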



French Abstract 

L'invention porte sur un procede et un appareil permettant d'organiser et
de traiter des segments d'informations interdependantes (ou « concepts »)
a l'aide d'un ordinateur. L'invention met en oeuvre des reseaux de
concepts associatifs, extremement flexibles, pour organiser et
representer des concepts enregistres numeriquement. Un reseau de concepts
determine une pluralite de concepts ainsi qu'une relation de reseau entre
les concepts. Une representation graphique du reseau de concepts,
comprenant une pluralite d'icones correspondant aux concepts, ainsi
qu'une pluralite de lignes de connexion correspondant aux relations entre
les concepts, est affichee. Chaque concept est associe a un ou plusieurs
programmes d'application logiciel tel qu'un traitement de mots ou un
programme utilitaire tableur. Les utilisateurs peuvent selectionner un
concept courant en dialoguant avec la representation graphique, puis le
concept courant est traite par invocation automatique du programme
d'application associe au concept courant en mode transparent. Les
utilisateurs peuvent modifier sans inconvenient le reseau de concepts en
redefinissant de maniere interactive les lignes de connexion entre les
concepts. Selon une autre variante, des valeurs d'attribut sont associees
aux differents concepts du reseau, puis une recherche est effectuee sur
le reseau pour identifier un sous-ensemble de concepts dont les valeurs
d'attribut sont egales a un ensemble desire de valeurs. Selon d'autres
variantes, des techniques permettent d'organiser des concepts
selectionnes du reseau pour des operations desirees a des moments
determines, et d'enregistrer les statistiques temporelles et
d'utilisation afin de conserver un historique des taches de traitement
effectuees sur chaque concept.
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Detailed Description 

Detailed Description 

for personal computers by the Apple® and Microsoft Windows® operating
systems, also simulates a

remedied.

The recent deluge of digital information bombarding everyday computer 
users from the Internet only heightens the need for a unified, simple 
information management method which groups of users . 

SUMMARY OF THE INVENTION 
The present invention enables users to organize information on a
digital computer in a flexible, associative manner, akin to the way in 
which information is organized by the human mind. Accordingly, the 
present invention utilizes highly flexible, associative matrices to 
organize . . . 
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English Abstract 

The invention disclosed herein relates to cooperative computing 
environments (10) and information retrieval and management methods and 
systems. More particularly, the present invention relates to methods and
systems for capturing and generating useful information about a user's 
access and use of data on a computer system (12), such as in the form of 
documents stored on remote servers, and making such useful information 
available to others. Documents on the computer system (10) are accessible 
through a plurality of different methods, such as by specifying an 
identifier or locator for the document, activating a hyperlink (14) in 
another document which points to the document, or navigating to the 
document through navigational commands in an application program (26) 
such as a browser (18). The method involves capturing information 
regarding each of the accessed documents in the set, the information 
including the method used to access the document, dividing the set of 
documents, labeling (30) each subset of documents with a topic (32), and 
making the labels (34) and documents accessed available to other users 
who wish to browse the same documents. 

French Abstract 

La presente invention concerne des environnements informatiques
cooperatifs (10) ainsi que des procedes et des systemes de gestion et
d'extraction d'informations. Plus particulierement, cette invention
concerne des procedes et des systemes de saisie et de production de
donnees utiles relatives a l'acces d'un utilisateur et a l'utilisation
des donnees dans un systeme d'ordinateur (12), par exemple sous la forme
de documents memorises sur des serveurs a distance, permettant que ces
informations utiles soient disponibles pour d'autres utilisateurs. Les
documents contenus dans le systeme d'ordinateur (10) sont accessibles par
une pluralite de differents procedes, par exemple en specifiant un
identificateur ou un localisateur pour le document, en activant un lien
hypertexte (14) dans un autre document citant ce document, ou encore en
naviguant a travers le document au moyen des commandes de navigation d'un
programme d'application (26), tel qu'un explorateur (18). En outre, ce
procede consiste d'abord a saisir les informations concernant chacun des
documents explores parmi l'ensemble de documents, ainsi que les
informations contenant le procede utilise pour acceder au document, a
diviser ensuite l'ensemble de documents et a etiqueter (30) chaque
sous-ensemble de documents selon un theme (32) pour enfin rendre les
etiquettes (34) et les documents accessibles a d'autres utilisateurs
desirant explorer les memes documents.

Main International Patent Class: G06F-017/30 
Fulltext Availability: 
Detailed Description 

Detailed Description 

. . . expertise in a particular field has already read. 



It is another object of the present invention to account for a user's 
method of accessing documents in determining how to group together 
sets of related documents . 

The above and other objects are achieved by a method for producing a
summary of topics for a set of documents accessed by a user on a 
computer system. 

Documents on the computer system are accessible through a plurality of
different methods, such as by specifying an identifier or locator for the
document, activating a... the document through navigational commands
in an application program such as a browser. The method involves
capturing information regarding each of the accessed documents in the
set, the information including the method used to access the document,
dividing the set of documents into subsets of documents based at
least in part on the methods used to access the documents, and labeling
each subset of documents with a topic.
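
The capture, divide, and label steps described above might be sketched in
Python as follows. This is only an illustration under simplifying
assumptions: the record fields (url, title, method), the function name
summarize_by_access_method, and the labeling rule (most frequent title
word) are all hypothetical, and the sketch divides the set purely by
access method, whereas the text says only "based at least in part on" it.

```python
from collections import Counter, defaultdict


def summarize_by_access_method(access_log):
    """Divide accessed documents into subsets by access method and label each.

    access_log: list of dicts with hypothetical keys "url", "title", "method".
    Returns {method: {"topic": str, "documents": [url, ...]}}.
    """
    # Step 1: the captured records already carry the access method.
    subsets = defaultdict(list)
    for record in access_log:
        # Step 2: divide the set by the method used to reach each document.
        subsets[record["method"]].append(record)

    summary = {}
    for method, records in subsets.items():
        # Step 3: label the subset; here the topic is simply the most
        # frequent word across the titles (an illustrative stand-in).
        words = Counter(w.lower() for r in records for w in r["title"].split())
        topic = words.most_common(1)[0][0]
        summary[method] = {
            "topic": topic,
            "documents": [r["url"] for r in records],
        }
    return summary
```

The resulting mapping is the kind of labeled summary that could then be
made available to other users who wish to browse the same documents.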

The method of . . . 
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English Abstract 

A document storing and retrieving system in a networking environment 
comprising a plurality of individual components, wherein the components 
interact and may be interfaced with external systems through platform 
independent communication mechanisms, i.e. JAVA, CORBA. The system may 
store and retrieve meta data (index data) as well as documents into and
from the same database so that both documents and meta data pertaining to
the respective documents may be contained within the same database.
Preferably, the document management system communicates with the database
through a standard, platform-independent communication mechanism, such as
JDBC. In particular, the invention concerns a system comprising at least
one system controlled scanner for scanning paper documents and producing
graphic image files representing the paper documents.

French Abstract 

L'invention porte sur un systeme de stockage et d'extraction de
documents dans un environnement de reseau comprenant une pluralite de
composants individuels. Ces composants ont une interaction et peuvent
etre interfaces avec des systemes externes par l'intermediaire de
mecanismes de communication tels que JAVA, CORBA, independants de la
plate-forme. Le systeme peut stocker et extraire, de la meme base de
donnees, des donnees meta (donnees d'indice) ainsi que des documents de
sorte que ces donnees meta et ces documents appartenant a des documents
respectifs puissent etre contenus dans la meme base de donnees. De
preference, le systeme de gestion de documents communique avec la base de
donnees par un mecanisme de communication standard, independant de la
plate-forme tel que JDBC. Cette invention porte notamment sur un systeme
comprenant au moins un lecteur commande par le systeme pour lire des
documents papiers et produire des fichiers d'images graphiques
representant les documents papiers.
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Detailed Description 

... the class dependent on the type of search carried out. When a search
has been carried out, the search manager will have produced a result set
which contains information from documents that match the search
criteria. The user can then choose to have the result set displayed
in either a summary view or tree view format, then select to view a
document's content with an appropriate viewer.

The components in the retrieve process communicate via events so there are
no dependencies between the classes.
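
A minimal Python sketch of this style of event-based decoupling is given
below. The excerpt does not show the patent's own classes (it refers to a
Java environment), so EventBus, SearchManager, SummaryView and the event
name "search_completed" are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class EventBus:
    """Tiny publish/subscribe hub; components know the bus, not each other."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_name, handler):
        self._handlers[event_name].append(handler)

    def publish(self, event_name, payload):
        # Copy the list so handlers may subscribe or unsubscribe safely.
        for handler in list(self._handlers[event_name]):
            handler(payload)


class SearchManager:
    """Produces a result set and announces it as an event."""

    def __init__(self, bus, documents):
        self.bus = bus
        self.documents = documents

    def search(self, term):
        result_set = [d for d in self.documents if term.lower() in d.lower()]
        self.bus.publish("search_completed", result_set)


class SummaryView:
    """Displays whatever result set is announced on the bus."""

    def __init__(self, bus):
        self.lines = []
        bus.subscribe("search_completed", self.show)

    def show(self, result_set):
        self.lines = [f"{i}. {doc}" for i, doc in enumerate(result_set, start=1)]
```

Neither SearchManager nor SummaryView imports or references the other, so
a tree view could subscribe to the same event without any change to the
search code.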

The class diagram. . . 
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English Abstract 

In a data mining system (12), clusters are used to categorize data
within each model. An initial set of estimates of the parameters of each
model and each cluster is provided. A portion of the data in the
database (10) is read from a storage medium and brought into a rapid
access memory buffer (22). Data contained in the data buffer (22) is used
to update the original guesses at the parameters of the model in each
cluster over all models. Some of the data belonging to a cluster is
summarized or compressed and stored as a reduced form of the data
representing sufficient statistics of the data. If further data is needed
to categorize the cluster, more data is gathered from the database (10)
and used in combination with compressed data until a stopping criterion
(140) is met.
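
To make the idea of a "reduced form of the data representing sufficient
statistics" concrete, here is a small Python sketch. It assumes, for
illustration only, that the statistics kept per cluster are a point
count, a per-dimension sum, and a per-dimension sum of squares, which is
enough to recover a mean and a standard deviation without keeping the
points. The class name SufficientStats is hypothetical.

```python
import math


class SufficientStats:
    """Compressed stand-in for a group of points: count, sum, sum of squares."""

    def __init__(self, dim):
        self.n = 0
        self.total = [0.0] * dim      # per-dimension sum
        self.total_sq = [0.0] * dim   # per-dimension sum of squares

    def add(self, point):
        """Fold one point into the summary; the point can then be discarded."""
        self.n += 1
        for i, x in enumerate(point):
            self.total[i] += x
            self.total_sq[i] += x * x

    def merge(self, other):
        """Combine two summaries; all three statistics are additive."""
        self.n += other.n
        for i in range(len(self.total)):
            self.total[i] += other.total[i]
            self.total_sq[i] += other.total_sq[i]

    def mean(self):
        return [s / self.n for s in self.total]

    def std(self):
        # Population standard deviation; clamp tiny negatives from rounding.
        return [
            math.sqrt(max(sq / self.n - (s / self.n) ** 2, 0.0))
            for s, sq in zip(self.total, self.total_sq)
        ]
```

Because the statistics are additive, summaries built from successive
buffer loads can be merged, which is what lets the buffer be refilled
from the database while earlier data stays represented.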

French Abstract 

Dans un systeme d'exploitation de donnees (12), on utilise des groupes
pour classer les donnees dans chaque modele. On prevoit un ensemble
initial d'estimations des parametres pour chaque modele et chaque groupe.
Une partie des donnees dans la base de donnees (10) est lue a partir d'un
support de memorisation et envoyee dans une memoire tampon (22) rapide
d'acces. Les donnees contenues dans la memoire tampon (22) sont utilisees
pour mettre a jour les estimations initiales au niveau des parametres du
modele dans chaque groupe tout au long des modeles. Certaines donnees
appartenant a un groupe sont resumees ou comprimees et enregistrees sous
forme reduite, ces donnees representant des statistiques suffisantes des
donnees. Si d'autres donnees sont necessaires pour classer le groupe,
davantage de donnees sont recueillies a partir de la base de donnees (10)
et utilisees en combinaison avec les donnees comprimees jusqu'a ce qu'on
puisse repondre aux criteres d'arret (140).
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Detailed Description 

. . . means the parameters are the means or centroids of the K clusters)
computed so far. A list structure designated LOWER is a vector of K
elements (one for each cluster) where each element points to a
vector of n elements (floats) holding the lower bounds for each attribute
of the CI on the mean of the corresponding cluster...

...value of the lower bound on the CI for the third cluster along dimension
2. A second structure designated UPPER is a vector of K elements (one
for each cluster) where each element points to a vector of n
elements (floats) holding the upper bounds for the CI on the parameters
of the model (mean or centroid in case of K-means) of the
corresponding cluster. Singleton points (elements of RS) not
changing cluster assignment when the K cluster centers are perturbed,
within their respective confidence intervals in a worst-case fashion, can
be summarized by adding them to the set DS and removing them from RS.
Appendix A is a summarization of the Worst Case Analysis that defines
LOWER. . .
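
One plausible reading of this worst-case test can be sketched in Python as
follows. Appendix A is not part of this excerpt, so the perturbation rule
used here is an assumption: the point's own center is moved as far from
the point as its confidence interval allows, every other center is moved
as close as its interval allows, and the point is summarized only if it
would still be nearest to its own cluster. LOWER and UPPER follow the
layout described above (K vectors of n floats); the function names are
hypothetical.

```python
def farthest_sq(point, lower, upper):
    """Squared distance from point to the farthest corner of the CI box."""
    return sum(max((x - lo) ** 2, (x - hi) ** 2)
               for x, lo, hi in zip(point, lower, upper))


def nearest_sq(point, lower, upper):
    """Squared distance from point to the nearest location inside the CI box."""
    # Clamping each coordinate to [lo, hi] gives the closest point in the box.
    return sum((x - min(max(x, lo), hi)) ** 2
               for x, lo, hi in zip(point, lower, upper))


def can_be_summarized(point, assigned, LOWER, UPPER):
    """True if the point keeps its cluster under the worst-case perturbation.

    assigned: index of the cluster the point currently belongs to.
    LOWER, UPPER: K vectors of n floats bounding each cluster center.
    """
    worst_own = farthest_sq(point, LOWER[assigned], UPPER[assigned])
    return all(
        worst_own < nearest_sq(point, LOWER[k], UPPER[k])
        for k in range(len(LOWER))
        if k != assigned
    )
```

Points that pass this test would be the candidates for moving from RS
into DS.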
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English Abstract

A computer-based electronic document and/or paper-based document 
management application program. The program provides an efficient way to 
automatically import, index, categorize, store, search, retrieve, 
manipulate and archive electronic documents. The program is also capable 
of managing documents regardless of document type or document format. 

French Abstract 

L'invention porte sur un programme d'application de gestion par
ordinateur de documents electroniques et/ou sur papier. Ledit programme
est une maniere efficace d'assurer automatiquement l'importation, le
classement, le tri par categories, le stockage, la recherche, la
recuperation, la manipulation et l'archivage de documents electroniques,
et cela independamment du type et du format des documents.

Main International Patent Class: G06F-017/30 

Fulltext Availability: 
Detailed Description 

Detailed Description 

... information for an electronic document. 

It is another object of the present invention to provide a user with a 
way to quickly browse through a document collection and identify a 
specific electronic document without first having to open each 
document, along with a corresponding host application program. 

It is yet another object of the present invention to provide a user with 
a way to quickly and efficiently browse through a collection of 
electronic documents and identify a specific electronic document by 
displaying summary information for the electronic document. 

In accordance with one aspect of the present invention, the foregoing and
other objects are achieved by a method for identifying an electronic
document in an electronic document collection. The method involves
generating summary information for the electronic document based upon
an electronic analysis of the document, then storing the summary
information in a document data structure corresponding to the
electronic document regardless of document type or document format. The
method also involves displaying a representation of the electronic
document, and activating a...

. . .the electronic document. 

In accordance with another aspect of the present invention, the foregoing
and other objects are achieved by a method for browsing a collection of 
electronic documents and/or a computer-readable storage medium having 
stored therein an electronic document management program. The method 
and/or program involve analyzing an electronic document... 
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English Abstract 

A method of selecting a subset of a plurality of document collections
for searching in response to a predetermined query is based on accessing
a meta-information data file that describes the query significant search
terms that are present in a particular document collection correlated to
normalized document usage frequencies of such terms within the documents
of each document collection. By access to the meta-information data file,
a relevance score for each of the document collections is determined. The
method then returns an identification of the subset of the plurality of
document collections having the highest relevance scores for use in
evaluating the predetermined query. The meta-information data file may be
constructed to include document normalized term frequencies and other
contextual information that can be evaluated in the application of a
query against a particular document collection.
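
The selection step can be illustrated with a short Python sketch. The
abstract does not give the scoring formula, so the rule used here (sum of
the normalized term frequencies of the query terms) is an assumption, as
are the function name select_collections, the top_n parameter, and the
dictionary layout standing in for the meta-information data file.

```python
def select_collections(query_terms, meta_info, top_n=2):
    """Return the names of the top_n collections by relevance score.

    meta_info: {collection_name: {term: normalized_frequency}}, a
    hypothetical in-memory stand-in for the meta-information data file.
    """
    scores = {}
    for collection, term_freqs in meta_info.items():
        # Illustrative score: add up the normalized frequencies of the
        # query terms that occur in this collection.
        scores[collection] = sum(
            term_freqs.get(term.lower(), 0.0) for term in query_terms
        )
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:top_n]
```

Only the returned subset of collections would then be searched with the
full query, which is the point of consulting the meta-information first.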
French Abstract 

L'invention se rapporte a un procede de selection d'un sous-ensemble
d'une pluralite de collections de documents destines a faire l'objet
d'une recherche, en reponse a une demande preetablie. Ce procede consiste
a acceder a un fichier de metadonnees decrivant les termes significatifs
de recherche associee a la demande qui sont presents dans une collection
de documents particuliere correlee a des frequences normalisees
d'utilisation de ces termes au sein des documents de chaque collection de
documents. L'acces au fichier de metadonnees permet d'attribuer une note
de pertinence a chaque collection de documents. Le procede permet ensuite
d'obtenir une identification du sous-ensemble de la pluralite de
collections de documents possedant les notes de pertinence les plus
elevees en vue de l'utiliser pour evaluer la demande preetablie. Le
fichier de metadonnees peut etre construit pour comporter des frequences
normalisees de termes de documents et d'autres informations contextuelles
qui peuvent etre evaluees dans l'application d'une demande concernant une
collection particuliere de documents.
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* Detailed Description 
. . . The meta-index 16 thus contains a set of documents that directly 
correspond to the set of document collections potentially searchable in 
response to any user query 12. 

The collection meta-index 16 documents can be prepared through a 
preprocessing 22 of base collection indexes 18, 20, often referred to 
generically . . . 

. . .present invention. 

Preferably, the meta-data of the indexes 18, 20 are directly preprocessed
22 to produce meta-index documents, also referred to as collection
summary records, of standardized format. Information characteristically
(language), if not explicitly (cost), describing the collection is
stored in the respective summary records as fielded text or data.

Thus, the preferred standardized summary record structure preserves a
combination of fielded data, term frequencies for contextually distinctive
search terms, and proximity information relating the various search
terms indexed. A collection summary record may be generated by
either a collection content provider or a collection access provider,
though the collection content provider will have more immediate access to
the base collection indexes, knowledge of the specific structure of
the base collection's index files, and knowledge of the specific
documents added to the base collection since any prior generation of a
corresponding summary record structure.

Preferably, the summary record structure is or will be standardized for 
use by all collection access providers who may provide access to 
particular base collections . 

By utilizing standardized summary record structures, the base
collection content providers have a standardized basis for supporting
collection searching independent of the search algorithms utilized by any
particular content access provider. Similarly, the standardized ... search
terms, excluding stopterms and that do not span a sentence terminator,
fixed in sets of two or more terms as they occur in the documents of a
base collection. In a preferred embodiment of the present invention,
term phrases can be chosen to be short series of two...

...summary records are prepared by the collection content
providers, or perhaps by a third party service company who operates on
behalf of some group of collection content providers, each collection
summary record can be pushed, preferably using a secure Internet
protocol, to each of the existing authorized collection access
providers. The summary records can be prepared and pushed to the
collection access providers on at least an as needed basis to reflect
significant updates in the contents of a base collection. Each time a
collection access provider properly receives an updated summary record,
their collection meta-index is correspondingly updated and any prior
existing summary record is overwritten or deleted.

Alternately, the content access providers may pull new and updated 
summary records from base collection content providers. Again, the 
actual transfer of the summary records is preferably by a secure Internet 
protocol. This allows the collection content providers to potentially... 

...pulls and therefore the currency of the summary record information that 
any particular content access provider receives. 

The content access provider may directly utilize the collection summary
records to create collection summary records for the meta-index
16. However, in a preferred embodiment of the present invention, the base
collection summary records are further processed by specific
collection access providers separately or in parallel with the
generation of the base collection indexes to optimize the organization of
the collection meta-index to any...
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English Abstract 

A method of organizing electronic documents for storage and subsequent 
retrieval, involves storing a summary structure describing the structure 
of summary records associated with each document. Each structured summary 
record has at least one field representative of a characteristic of the 
document. A predetermined number of field values identify the value of 
the characteristic associated with the field. Predetermined keyword 
criteria associated with the field values are stored. Each document is 
analyzed to build a text index listing the occurrence of unique 
significant words in the document. The text index is compared with the 
keyword criteria to determine the appropriate field value for the 
document. For example, one characteristic field might be related to 
topic, which could have the field values of "financial" or "sports". The 
preponderance of certain keyword criteria, such as "money" or "shares" 
would identify the document with the financial topic. 
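
The comparison of a text index against keyword criteria can be sketched
in Python as below. The stopword list, the tokenizer, and the function
names build_text_index and choose_field_value are assumptions made for
this example; the sketch picks the field value whose keywords occur most
often, which is one simple way to measure a "preponderance".

```python
import re
from collections import Counter

# Hypothetical stopword list; the abstract only says "significant words".
STOPWORDS = frozenset({"the", "a", "of", "and", "in", "to", "as", "was"})


def build_text_index(text):
    """Count occurrences of each significant (non-stopword) word."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def choose_field_value(text_index, candidates):
    """Pick the field value whose keyword criteria match the most words.

    candidates: {field_value: [keyword, ...]} for one summary field.
    Returns None when no keyword occurs in the document.
    """
    best_value, best_count = None, 0
    for value, keywords in candidates.items():
        count = sum(text_index[k] for k in keywords)
        if count > best_count:
            best_value, best_count = value, count
    return best_value
```

For a "topic" field with candidates "financial" (money, shares) and
"sports" (goal, match), a document mentioning shares and money more often
than match would receive the field value "financial".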
French Abstract 

Un procede pour organiser des documents electroniques en vue de leur
stockage et extraction ulterieure, consiste a memoriser une structure
sommaire decrivant la structure de resumes associes a chaque document.
Chaque resume structure a au moins un champ representant une
caracteristique du document. Un nombre predetermine de valeurs de champ
identifie la valeur de la caracteristique associee audit champ. Des
criteres a mot-cle predetermines associes aux valeurs de champ sont mis
en memoire. Chaque document est analyse pour construire un index de texte
enumerant l'occurrence des mots uniques significatifs dans le document.
L'index de texte est compare avec les criteres a mot-cle pour determiner
la valeur de champ appropriee pour le document. Par exemple, un champ de
caracteristique pourrait etre associe a un sujet, ledit sujet ayant comme
valeurs de champ les mots "financier" ou "sport". La preponderance de
certains criteres a mot-cle, tels que "argent" ou "actions",
identifierait le document comme appartenant au sujet financier.
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Detailed Description 

. . . a "current candidate" and its word count (to be described). At block
11, the system is also initialized to set the current candidate and
corresponding . . . At step 12, the system sets the summary record field
name to the next unique field name in the summary structure database
starting from the first, and at 13 retrieves from the summary candidate
database the next summary candidate (selected candidate) also starting
from the first having a field name matching the summary record field
name that has just been set. For example, the first summary record
field name might be "category".

The first summary candidate with a field name category might be
"financial", having the criteria keywords noted above.

Next, the . . .
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English Abstract 

An information storage, searching and retrieval system for large
(gigabytes) domains of archived textual data. The system includes
multiple query generation processes, a search process, and a presentation
of search results that is sorted by category or type and that may be
customized based on the professional discipline (or analogous personal
characteristic) of the user, thereby reducing the amount of time and cost
required to retrieve relevant results.

French Abstract 

L'invention concerne un systeme de stockage, de recherche et
d'extraction d'informations pour de vastes (gigaoctets) domaines de
donnees de textes archivees. Ce systeme comprend plusieurs processus de
generation d'interrogations, un processus de recherche, et une
presentation des resultats de recherches qui sont tries par categorie ou
par type. En outre, ces derniers peuvent etre personnalises en fonction
de la categorie professionnelle (ou de caracteristiques personnelles
analogues de l'utilisateur), ce qui permet de reduire le temps requis et
les couts associes a l'extraction des resultats recherches.
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corresponds to category #2, then all of the documents responsive to 
the search query that fall within these categories are lumped together in
the category "Product Information" in categories set #2. Thus, the
same query launched by two users corresponding to different
categories will yield the same answer set, but the answer set will be
summarized differently for the two individuals, each being tailored to
their particular needs. This customization of the summary of the search
results facilitates review of the search results, saving time for...
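
The idea that one answer set is summarized differently for different
users can be shown with a brief Python sketch. The document types, the
two category sets, and the function name summarize are hypothetical
examples; only the category name "Product Information" comes from the
text above.

```python
from collections import Counter

# Two hypothetical category sets mapping document types to categories.
CATEGORY_SET_1 = {
    "brochure": "Marketing",
    "datasheet": "Technical",
    "patent": "Legal",
}
CATEGORY_SET_2 = {
    "brochure": "Product Information",
    "datasheet": "Product Information",
    "patent": "Legal",
}


def summarize(answer_set, category_set):
    """Count documents per category under the chosen category set.

    answer_set: list of dicts with a hypothetical "type" key.
    Document types missing from the set are counted under "Other".
    """
    counts = Counter()
    for doc in answer_set:
        counts[category_set.get(doc["type"], "Other")] += 1
    return dict(counts)
```

The answer set itself is identical for both users; only the grouping
applied when counting it changes with the category set selected for the
user.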

documents responsive to the query which fall within various
predetermined categories of document types.

11 The system of claim 10 wherein the means for categorizing
documents and generating the summary includes a plurality of
predetermined sets of categories of document types, each category in
a set corresponding to one or more document types.

12 The system of claim 11 wherein the means for generating the
summary includes means for customizing the summary for the user by
automatically selecting one of the sets of categories for use in
preparing the summary, such set of categories being selected based
on predetermined criteria relating to the identity of or a personal
characteristic of the user, so that the summary for an individual user is
automatically customized for the user based on
the user's identity or such personal characteristic of the user.

13 The system of claim 10 wherein the means for generating the
summary includes a plurality of predetermined sets of categories of
document types, each category corresponding to one or more document
types, the means for generating the summary further including means for
automatically customizing the summary by automatically selecting one of
the sets of categories, based...

.such documents 

were obtained, including means for generating a summary of the number of 
documents responsive to the query which fall within each of the document 
types . 

15 The system of claim 14 wherein the means for generating the
summary includes one or more predetermined sets of categories of
document types, each category corresponding to one or more document
types, and further includes means for summarizing the number of documents
responsive to the query which fall within the various predetermined
categories of a selected...

. the 

summary includes means for customizing the summary for the user by 
automatically selecting one of the sets of categories for use in 
preparing the summary , such set of categories being selected based on 
predetermined 
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English Abstract 

A scalable clustering algorithm (12) accesses a database (10) of records
having attributes or data fields of both enumerated discrete and ordered
values and brings a portion of the data records into a rapid access
memory. A cluster model for the data includes a table of probabilities
(160) for the enumerated, discrete data fields of the data records. The
cluster model for data fields that are ordered comprises a mean and
spread of the cluster. The cluster model is updated from the database
records brought into the rapid access memory. Some of the database
records in the rapid access memory are summarized and stored within the
rapid access memory. A criterion is evaluated to determine if further data
should be accessed from the database to further cluster data records in
the database. Additional database records in the database are accessed
and brought into the rapid access memory for further updating of the
cluster model.

French Abstract

L'invention concerne un algorithme de groupement a echelle variable (12)
qui permet d'acceder a une base de donnees (10) dans laquelle les
enregistrements ont des attributs de champs de donnees dont les valeurs
sont a la fois discretes, enumerees, et ordonnees. L'algorithme permet
d'introduire une partie des donnees dans une memoire a acces rapide. Un
modele de groupement pour les donnees est presente, qui comprend une
table de probabilites (160) correspondant aux champs de donnees
discretes, enumerees, des enregistrements de donnees. Le modele de
groupement pour les champs de donnees ordonnees fournit une indication de
moyenne et de variabilite pour le groupement. Le modele est actualise a
partir des enregistrements introduits dans la memoire a acces rapide.
Certains enregistrements introduits dans la memoire a acces rapide sont
resumes et stockes dans ladite memoire. L'evaluation d'un critere permet
de determiner s'il convient d'acceder a des donnees supplementaires
depuis la base de donnees pour poursuivre le groupement d'enregistrements
dans ladite base de donnees. Ensuite, on accede a des enregistrements
supplementaires dans la base de donnees, afin d'introduire ces
enregistrements dans la memoire a acces rapide et de poursuivre ainsi
l'actualisation du modele de groupement.
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... sqrt(0.25) = O.S. Hence, in the worst 
case, the standard deviation is 0.5. Since p takes values between [0, 1], we threshold 
the standard deviation of the probability by 0/2. 

Set CS-Temp = CS-New ∪ CS. Augment the set of previously computed 
sufficient statistics CS with the new ones surviving the... 

...n: sufficient statistics s (corresponding to a sub-cluster) in CS-Temp 
determine s', the set of sufficient statistics in CS-Temp with 
highest probability of membership in the subcluster represented by s. 

If the subcluster formed by merging s and s', denoted by merge(s, s'), 
is such that the maximum standard deviation along any continuous 
dimension is less than 0 or the maximum standard deviation of an entry 
in the attribute/value probability table is greater than 0/2 (P in 
the range [0, 1]), then add merge(s, s') to CS-Temp and remove s 
and s' from CS-Temp. 

Set CS = CS-Temp. Remove from RS all points that went into CS (RS = RS 
- CS). Note that the vectors Sum, Sumsq, values of M and the 
attribute/value probability tables for the newly-found CS elements were 
determined in the subclustering process or in the merge processes. Note 
that the function merge(s, s') simply computes the sufficient statistics 
for the sub-cluster summarizing the points in both s and s' (i.e., computes 
Sum, Sumsq, M, attribute/value probabilities for the sub-cluster 
consisting of points in s and s'). 
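
The merge(s, s') step described above can be sketched as follows. This is a minimal Python illustration, not the patent's implementation: the `SubCluster` class, its field names, and the record-count-weighted averaging of the attribute/value probability tables are assumptions, since the excerpt only states that Sum, Sumsq, M and the probability tables are computed for the merged sub-cluster. The two helper functions return the quantities that the merge test compares against its thresholds; the thresholds themselves are not modeled here.

```python
import math
from dataclasses import dataclass


@dataclass
class SubCluster:
    """Sufficient statistics for one sub-cluster (field names are hypothetical)."""
    m: float        # M: number of records summarized
    lin_sum: list   # Sum: per-dimension sum of the continuous attributes
    sq_sum: list    # Sumsq: per-dimension sum of squares
    prob: dict      # attribute -> {value: probability} for discrete attributes


def merge(s, t):
    """Combine two sub-clusters into the statistics of their union."""
    m = s.m + t.m
    lin_sum = [a + b for a, b in zip(s.lin_sum, t.lin_sum)]
    sq_sum = [a + b for a, b in zip(s.sq_sum, t.sq_sum)]
    prob = {}
    for attr in set(s.prob) | set(t.prob):
        ps, pt = s.prob.get(attr, {}), t.prob.get(attr, {})
        # Assumption: tables are averaged, weighted by record count.
        prob[attr] = {v: (s.m * ps.get(v, 0.0) + t.m * pt.get(v, 0.0)) / m
                      for v in set(ps) | set(pt)}
    return SubCluster(m, lin_sum, sq_sum, prob)


def max_continuous_std(c):
    """Largest standard deviation along any continuous dimension."""
    return max(math.sqrt(max(q / c.m - (a / c.m) ** 2, 0.0))
               for a, q in zip(c.lin_sum, c.sq_sum))


def max_probability_std(c):
    """Largest sqrt(p * (1 - p)) over all probability-table entries."""
    return max(math.sqrt(p * (1.0 - p))
               for table in c.prob.values() for p in table.values())
```

Because only sums and counts are stored, merging never needs to revisit the underlying records.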
Data Structures 

Data structures used during performance of the clustering evaluation are 
found in Figures . . . 

...and an attribute/value probability table P (entries are floats) such as 
the table of Figure 9A. 

This number M represents the number of database records represented by a 
given cluster. The model includes K entries, one for each cluster. 

The vector 'SUM' represents the sum of the weighted contribution of each 
of the n continuous. . . 
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English Abstract 

Building resource (e.g., Internet content) and attribute transition 
probability models and using such models for pre-fetching resources, 
editing resource link topology, building resource link topology 
templates, and collaborative filtering. 

French Abstract 

La presente invention concerne la construction de modeles de probabilite 
de transition de ressources (par exemple, de contenu Internet) et 
d'attribut et l'utilisation de ces modeles pour la preextraction de ces 
ressources, pour l'edition de la topologie des liens des ressources, pour 
la construction de gabarits de la topologie des liens des ressources et 
pour le filtrage cooperatif. 

Main International Patent Class: G06F-017/30 
Fulltext Availability: 
Detailed Description 

Detailed Description 
. . . are available in 
faster memory. 

As discussed above with reference to Figure 35, 
users may be clustered to define a number of transition 
probability matrices. To reiterate, free parameters of a 
probabilistic model that might have generated the usage 
log data are estimated. These free parameters are used 
to. . . 

. . . the associated 

transition probability matrices. Thus, when a new user 
arrives at an Internet site, that user is classified into 
one (or more) of the clusters of users . The probability 
that the new user belongs to a given cluster k of the m 
clusters can be determined as follows. 

bli k) = p(l --> kj n{k), (,)P, (,)p) 
1 5 OC P(n (k. . . 

...the new user may be determined to belong to the 
cluster having the maximum value for 81(k). Alternatively, 
since all of the 81(k) values should have a value between 0 
and 1, the new user may be determined to partly belong to 
all of the clusters, in a proportion determined by the 
probability 81(k). 

determining a pre-fetch resource occurs as 
follows. If the new user is determined to belong to only 
one cluster of users, the transition probability matrix 
from that cluster of users is used to determine the most 
likely resource to be requested given the last resource 
requested. If, on the other hand, the new user is 
determined to partially belong to all of the m clusters 
of users, the transition probability matrices associated 
with the clusters of users, as weighted by the 
probabilities 81(k), are used to determine the most likely 
resource to be requested given the last resource 
requested. 
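
The two pre-fetch cases described above can be sketched in Python. This is an illustration under stated assumptions rather than the patent's method: the function names and the list-of-lists layout for the transition matrices are hypothetical, and the cluster membership probabilities are taken as given because the formula that produces them is not legible in this excerpt.

```python
def most_likely_next(matrices, memberships, last):
    """Return the index of the resource most likely to be requested next.

    matrices[k][i][j] is cluster k's probability of moving from resource i
    to resource j; memberships[k] is the new user's probability of
    belonging to cluster k.
    """
    n = len(matrices[0][last])
    # Mix the rows for the last resource, weighted by cluster membership.
    mixed = [sum(w * t[last][j] for w, t in zip(memberships, matrices))
             for j in range(n)]
    return max(range(n), key=mixed.__getitem__)


def hard_assignment(memberships):
    """One-hot memberships for the single most probable cluster."""
    best = max(range(len(memberships)), key=memberships.__getitem__)
    return [1.0 if k == best else 0.0 for k in range(len(memberships))]
```

Passing `hard_assignment(memberships)` covers the single-cluster case, and passing the raw memberships covers the partial-membership case, so one function serves both.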



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English Abstract 

Methods for assigning a quantitative score to the relatedness of aligned 
polymorphic biopolymer sequences such that small differences between 
otherwise identical sequences are highlighted are disclosed, including 
computer systems and program storage devices for carrying out the methods 
on a computer. Specifically, the methods of the invention comprise the 
steps of providing a test sequence and a basis set of sequences such that 
the test sequence and a basis set of sequences are aligned; determining 
the identity of a monomer unit at a position m in the test sequence; 
assigning a value of 1 to a local matching probability xm if the monomer 
unit at position m in the test sequence matches any members of the basis 
set at position m, or, assigning a value of between 0 and 1 to a local 
matching probability xm if the monomer unit at position m in the test 
sequence does not match any members of the basis set at position m. In a 
preferred embodiment, the above method is performed at a plurality of 
sequence locations and the local matching probabilities are multiplied 
together to provide a global matching probability. 

French Abstract 

L'invention porte sur des procedes d'attribution d'un indice quantitatif 
de parente entre des sequences alignees de biopolymeres polymorphes 
permettant de mettre en evidence de petites differences entre des 
sequences sinon identiques, et sur les systemes informatiques et les 
dispositifs de stockage de programmes permettant la mise en oeuvre 
informatisee desdits procedes. Ces procedes comprennent specifiquement 
les etapes suivantes: recueillir une sequence d'essai et un ensemble de 
base de sequences qui sont disposes de maniere a etre alignes; determiner 
l'identite d'un monomere en position m de la sequence d'essai; attribuer 
la valeur 1 a une probabilite locale de correspondance xm lorsque le 
monomere en position m dans la sequence d'essai correspond a l'un des 
elements de l'ensemble de base en position m, ou attribuer une valeur 
entre 0 et 1 a la probabilite locale de correspondance xm lorsque le 
monomere en position m de la sequence d'essai ne correspond a aucun des 
elements de l'ensemble de base en position m. Dans la variante preferee, 
le susdit procede s'effectue pour differents emplacements de sequences, 
et les probabilites locales de correspondance sont multipliees entre 
elles pour fournir une probabilite globale de correspondance. 

Main International Patent Class: G06F-017/30 
Fulltext Availability: 
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Detailed Description 
Detailed Description 

unit at position m in the 
test sequence does not match any of the members of the basis 
set at position m, the local matching probability xm is 
assigned a value of between 0 and 1. Conceptually, xm 
corresponds to a maximum probability that a monomer unit is in 
fact present at position m in at least one of the basis 
templates used to generate the basis... 

. . .monomer unit is not 
represented at position m in any of the members of the basis 
set, the method of the invention assigns a finite probability 
that such monomer unit is in fact present in the population of 
basis templates used to generate the basis set, but is present 
at levels . . . 

. . . when 
the monomer unit at position m does not match the members of 
the basis set of N sequences is according to the relation 

    x_m = (1 - p)^n 

where p is a number between 0 and 1 and n is the number of 
sequences in the basis set having an element at position m. 
Note that when the sequences of the basis set overlap at every 
position m, then n=N for each position m. However... 
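
As a rough Python sketch of the scoring just described: a local matching probability is 1 on a match and (1 - p)^n otherwise, and the global probability is the product over positions. The function names, the use of `None` for a basis sequence with no element at a position, and the treatment of shorter basis sequences are assumptions made for illustration only.

```python
def local_matching_probability(test_unit, basis_units, p):
    """x_m for one position: 1 on a match, (1 - p)**n otherwise.

    basis_units holds the basis-set monomers at this position, with None
    for sequences that have no element here; n counts the non-None entries.
    """
    present = [u for u in basis_units if u is not None]
    if test_unit in present:
        return 1.0
    return (1.0 - p) ** len(present)


def global_matching_probability(test_seq, basis_set, p):
    """Product of the local matching probabilities over all positions."""
    result = 1.0
    for m, unit in enumerate(test_seq):
        # Basis sequences shorter than the test sequence contribute None.
        column = [seq[m] if m < len(seq) else None for seq in basis_set]
        result *= local_matching_probability(unit, column, p)
    return result
```

For example, with p = 0.5 and two basis sequences that both differ from the test sequence at a single position, the score is (1 - 0.5)^2 = 0.25.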
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Sec Items Description 

S1         7844  (SUMMARIZED OR SUMMARISED OR SUMMARY)(1W)(VALUE? ? OR NUMB- 
                 ER? ? OR NUMERAL? ? OR RESULT? ?) 
S2         6701  (SUMMARIZED OR SUMMARISED OR SUMMARY)(5N)(GROUP???? OR SET? 
                 ? OR CLUSTER? ? OR COLLECTION? ?) 
S3      1435145  (AVERAGE OR AVG OR EXPECTED)(1W)(VALUE? ? OR NUMBER? ? OR - 
                 NUMERAL? ? OR RESULT? ?) OR MEAN 
S4      7849095  RECORD? ? OR DOCUMENT? ? OR ARTICLE? ? OR ITEM? ? OR ELEME- 
                 NT? ? OR FILE? ? OR PRODUCT? ? OR MERCHANDISE? ? 
S5      2745248  IMAGE? ? OR PHOTO? ? OR PHOTOGRAPH? ? OR PICTURE? ? OR GRA- 
                 PHIC? ? 
S6      4095660  PROFILE? ? OR USER? ? OR CONSUMER? ? OR CUSTOMER? ? OR BUY- 
                 ER? ? OR PURCHASER? ? OR SHOPPER? ? OR INDIVIDUAL? ? OR PERSO- 
                 N? ? OR PEOPLE? ? 
S7       836420  S4:S6(5N)(GROUP???? OR SET? ? OR CLUSTER? ? OR COLLECTION? 
                 ?) 
S8      6731153  RECOMMEND? OR PREDICT? OR GUESS??? OR SUGGEST? OR REFER? ? 
                 OR REFERRAL? ? OR REFERRING OR FORECAST??? OR PROBABILIT? 
S9        22209  S1:S3(7N)S4:S6(7N)(COMPAR? OR CORRELAT? OR MATCH??? OR REL- 
                 ATE? ? OR RELATING OR SIMILAR? OR LIKEN??? OR CORRESPOND? OR - 
                 ASSOCIAT? OR JUDG??? OR WEIGH??? OR MEASUR???) 
S10       44662  (VALUE? ? OR NUMBER? ? OR NUMERAL? ? OR INTEGER? ?)(5N)(RA- 
                 NGE? ? OR SERIES OR BETWEEN OR "FROM")(5N)(ZERO OR 0)(5N)(ONE 
                 OR 1) 
S11         232  S1:S2 AND S7 AND S8 
S12          34  S11 AND S9 
S13          io  RD (unique items) 
S14         Tl5  S1:S2(5N)S4:S6(5N)(COMPAR? OR CORRELAT? OR MATCH??? OR REL- 
                 ATE? ? OR RELATING OR SIMILAR? OR LIKEN??? OR CORRESPOND? OR - 
                 ASSOCIAT? OR JUDG??? OR WEIGH??? OR MEASUR???) 
S15          74  S14 AND S7:S8 
S16          56  RD (unique items) 
S17          33  S16 NOT (S13 OR PY=2001:2004) 
S18          76  S10(20N)PROBABILIT?(20N)(GROUP???? OR SET? ? OR CLUSTER? ? 
                 OR COLLECTION? ?) 
S19          57  RD (unique items) 
S20       I3i9'  S19 NOT PY=2001:2004 
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DIALOG (R) File 3:Ei Compendex(R) 

<c! 2004 Elsevier Eng. Info. Inc. All rts. reserv. 

— ; I J 4 3 E.I. No: EI P0 120495859"? 
Title: Image retrieval using hierarchical self-organizing feature maps 

Author: Sethi, I.K.; Coman, I. 

Corporate Source: Wayne State Univ, Detroit, MI, United States 
Conference Title: Proceedings of the 1999 Pattern Recognition in Practice 
(PRP VI) 

Conference Location: Vlieland, Neth Conference Date: 19990602-19990604 
E.I. Conference No.: 56190 

Source: Pattern Recognition Letters v 20 n 11-13 Nov 1999. p 1337-1345 

Publication Year: 1999 

CODEN: PRLEDG ISSN: 0167-8655 

Language: English 

Document Type: JA; (Journal Article) Treatment: A; (Applications); G; 
(General Review) 

Journal Announcement: 0105W2 

Abstract: This paper presents a scheme for image retrieval that lets a 
user retrieve images either by exploring summary views of the image 
collection at different levels or by similarity retrieval using query 
images . The proposed scheme is based on image clustering through a 
hierarchy of self-organizing feature maps. While the suggested scheme 
can work with any kind of low-level feature representation of images, our 
implementation and description of the system is centered on the use of 
image color information. Experimental results using a database of 2100 
images are presented to show the efficacy of the suggested scheme. 
(Author abstract) 15 Refs. 

Descriptors: *Pattern matching; Query languages; Image analysis; 
Information retrieval; Color image processing; Hierarchical systems; Data 
structures 

Identifiers: Exploration-based retrievals; Self-organizing feature maps 
Classification Codes: 

723.5 (Computer Applications); 723.3 (Database Systems); 723.2 (Data 
Processing); 903.3 (Information Retrieval & Use); 741.1 (Light & Optics) 

723 (Computer Software, Data Handling & Applications); 903 (Information 
Science); 741 (Light, Optics & Optical Devices) 

72 (COMPUTERS & DATA PROCESSING); 90 (ENGINEERING, GENERAL); 74 (LIGHT 
& OPTICAL TECHNOLOGY) 
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(c) 2004 Elsevier Eng. Info. Inc. All rts. reserv. 

02968469 E.I. Monthly No: EI9010123291 

Title: Methods of digraph representation and cluster analysis for 
analyzing free association. 

Author: Miyamoto, S.; Suga, S.; Oi, K. 

Corporate Source: Univ of Tsukuba, Inst of Inf Sci & Electron, Tsukuba, 
Jpn 

Source: IEEE Transactions on Systems, Man and Cybernetics v 20 n 3 
May-Jun 1990 p 695-701 
Publication Year: 1990 
CODEN: ISYMAW ISSN: 0018-9472 
Language: English 

Document Type: JA; (Journal Article) Treatment: T; (Theoretical) 
Journal Announcement: 9010 

Abstract: A method for constructing two measures of association between a 
pair of words that distribute over a sequence is developed. The association 
measures are used for digraph representation and cluster analysis. In 
particular, study of a measure for cluster analysis leads to a new 
algorithm for hierarchical agglomerative clustering. The digraph 
representation and the cluster analysis are applied to data of free 
(psychological) association obtained from a questionnaire survey on the 
living environment of local residents. The two association measures are 
interpreted as estimates of probabilistic parameters. Hence, methods of 
hypothesis testing are developed for showing differences of structures of 
the free associations between two different populations. The results of 
the analysis of the association data are summarized into figures of 
digraphs and clusters that show structures of free associations of 
groups of people. 7 Refs. 

Descriptors: SYSTEMS SCIENCE AND CYBERNETICS--Cognitive Systems; 
MATHEMATICAL TECHNIQUES--Graph Theory; STATISTICAL METHODS--Statistical 
Tests; PROBABILITY--Random Processes 

Identifiers: CLUSTER ANALYSIS; COGNITIVE SCIENCE; PATTERN CLUSTERING 
METHODS; HIERARCHICAL AGGLOMERATIVE CLUSTERING; DIGRAPH REPRESENTATION; 
FREE ASSOCIATION ANALYSIS 

Classification Codes: 

912 (Industrial Engineering & Management); 921 (Applied Mathematics) 
922 (Statistical Methods) 

91 (ENGINEERING MANAGEMENT); 92 (ENGINEERING MATHEMATICS) 
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(c) 2004 Elsevier Eng. Info. Inc. All rts. reserv . 

0:^0425 E.I. Monthly No: EI8604028002 E.I. Yearly No: EI86021560 
Title: PROBABILISTIC LOGIC. 

Author: Nilsson, Nils J. 

Corporate Source: Stanford Univ, Computer Science Dep, Stanford, CA, USA 

Source: Artificial Intelligence v 28 n 1 Feb 1986 p 71-87 

Publication Year: 1986 

CODEN: AINTBB ISSN: 0374-2539 

Language: ENGLISH 

Document Type: JA; (Journal Article) Treatment: T; (Theoretical) 
Journal Announcement: 8604 

Abstract: Because many artificial intelligence applications require the 
ability to reason with uncertain knowledge, it is important to seek 
appropriate generalizations of logic for that case. We present here a 
semantical generalization of logic in which the truth values of sentences 
are probability values ( between 0 and 1 ). Our generalization 
applies to any logical system for which the consistency of a finite set 
of sentences can be established. The method described in the present paper 
combines logic with probability theory in such a way that probabilistic 
logical entailment reduces to ordinary logical entailment when the 
probabilities of all sentences are either 0 or 1. (Author abstract) 18 
refs. 

Descriptors: *COMPUTER METATHEORY--Formal Logic; ARTIFICIAL INTELLIGENCE; 
PROBABILITY 

Identifiers: PROBABILISTIC LOGIC 
Classification Codes: 

723 (Computer Software); 922 (Statistical Methods) 

72 (COMPUTERS & DATA PROCESSING); 92 (ENGINEERING MATHEMATICS) 



20/5/9 (Item 1 from file: 202) 

DIALOG(R) File 202:Info. Sci. & Tech. Abs. 
(c) 2004 EBSCO Publishing. All rts. reserv. 

1905289 

Recent developments in the theory of information retrieval. 

Book Title: Report No: ED 232 691 
Author (s): Bookstein, A 
(29 pages) 

Publication Date: Dec 1982 
Publisher: Royal Inst. of Tech. 
Language: English 
Place of Publication: Sweden 
Document Type: Book Chapter 
Record Type: Abstract 
Journal Announcement: 1900 

Recently considerable attention has been given in the online information 
retrieval literature to techniques for producing a weighted output of 
documents in response to a request. One approach tries to maintain the form 
of and relationships among requests as they appear in current Boolean 
logic-based systems, while extending it to permit a weighted output. It is 
based on the mathematics of fuzzy-set theory, which assigns each 
potential member of a set a degree of membership between zero and 
one, with intermediate values denoting partial membership in the set. 
Another approach is based on the mathematics of probability. It 
represents requests by sets of terms, and, by means of feedback 
information, assigns a weight to each term. Documents are ordered by the 
sum of weights of the terms in the request that match those in the 
documents. This paper provides an overview of both approaches and their 
advantages and disadvantages. 

Descriptors: Documentation; Information retrieval 

Classification Codes and Description: 5.11 (Searching and Retrieval) 
Main Heading: Information Processing and Control 
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20/5/48 (Item 11 from file: 34) 

DIALOG(R) File 34:SciSearch(R) Cited Ref Sci 
(c) 2004 Inst for Sci Info. All rts. reserv. 

00991734 Genuine Article#: FL8S0 Number of References: 21 
Title: A COMPUTER-SIMULATION STUDY OF CAVITIES IN THE HARD DISK FLUID AND 
CRYSTAL 

Author (s): SPEEDY RJ; REISS H 

Corporate Source: VICTORIA UNIV WELLINGTON, DEPT CHEM, POB 

600 /WELLINGTON/ /NEW ZEALAND/; UNIV CALIF LOS ANGELES, DEPT CHEM & 
BIOCHEM/LOS ANGELES/ /CA/90025 

Journal: MOLECULAR PHYSICS, 1991, V72, N5, P1015-1033 

Language: ENGLISH Document Type: ARTICLE 

Geographic Location: NEW ZEALAND; USA 

Subfile: SciSearch; CC PHYS--Current Contents, Physical, Chemical & Earth 
Sciences 

Journal Subject Category: PHYSICS, ATOMIC, MOLECULAR & CHEMICAL 
Abstract: The number and size of the cavities in a hard disc fluid and 
crystal are calculated in a computer simulation experiment. A cavity 
is a region where there is sufficient space to insert another disc. In 
the higher-density fluid and in the crystal the number of cavities per 
disc, n(c), closely follows the exact one-dimensional result n(c) = 
exp{-zpV/RT}, where z is the density relative to close packing, over 40 
orders of magnitude. The average size of the cavities, <v>, varies by 
only 3.5 orders of magnitude in the same density range, and, to within 
about 20%, <v> varies as <v> = [sigma/(pV/RT - 1)]^2, where sigma is the 
disc diameter. Across the freezing transition n(c)<v> is exactly 
constant. When the crystal melts to a fluid the number of cavities 
increases by about 50% and their size decreases in proportion, but 
their surface-to-volume ratio only decreases by 5%, showing that they 
have a more compact shape in the fluid. Above one-half of the 
close-packed density the computed values of n(c) and <v> are 
represented precisely by ln n(c) = 1 - pV/RT - F(z) and ln <v> = 
DELTA-S/R - ln(N/V) + F(z), where DELTA-S is the entropy relative to 
the ideal gas. F(z) is exactly zero in one dimension, and we find 
empirically that in two dimensions F(z) = -0.25 + 2 ln z in the crystal 
and F(z) = -2.2 - 2 ln z in the dense fluid. The number of vacancies 
per disc, n(v), in the crystal is measured and can be represented by 
n(v) = n(c)/[2(z - 0.75)]. At low density there is a large cavity that 
percolates. At z = 0.237 +/- 0.003 there is equal probability of 
a cavity or a cluster percolating. The number of cavities reaches 
a maximum of one for every three discs at z = 0.38. Relations 
between cavity, cell and free volume theories are discussed 
empirically and theoretically. 



File 275:Gale Group Computer DB(TM) 1983-2004/Jan 26 
         (c) 2004 The Gale Group 
File 621:Gale Group New Prod.Annou.(R) 1985-2004/Jan 26 
         (c) 2004 The Gale Group 
File 636:Gale Group Newsletter DB(TM) 1987-2004/Jan 26 
         (c) 2004 The Gale Group 
File  16:Gale Group PROMT(R) 1990-2004/Jan 26 
         (c) 2004 The Gale Group 
File 160:Gale Group PROMT(R) 1972-1989 
         (c) 1999 The Gale Group 
File 148:Gale Group Trade & Industry DB 1976-2004/Jan 26 
         (c)2004 The Gale Group 
File 624:McGraw-Hill Publications 1985-2004/Jan 26 
         (c) 2004 McGraw-Hill Co. Inc 
File  15:ABI/Inform(R) 1971-2004/Jan 27 
         (c) 2004 ProQuest Info&Learning 
File 647:CMP Computer Fulltext 1988-2004/Jan W3 
         (c) 2004 CMP Media, LLC 
File 674:Computer News Fulltext 1989-2004/Jan W4 
         (c) 2004 IDG Communications 
File 696:DIALOG Telecom. Newsletters 1995-2004/Jan 15 
         (c) 2004 The Dialog Corp. 
File 369:New Scientist 1994-2004/Jan W3 
         (c) 2004 Reed Business Information Ltd. 



Set     Items   Description 
S1      11124   (VALUE? ? OR NUMBER? ? OR NUMERAL? ? OR INTEGER? ?)(5N)(RA- 
                NGE? ? OR SERIES OR BETWEEN OR "FROM")(5N)(ZERO OR 0)(5N)(ONE 
                OR 1) 
S2         62   S1(20N)PROBABILIT???(20N)(GROUP???? OR SET? ? OR CLUSTER? 
                OR COLLECTION? ?) 
S3         52   RD (unique items) 
S4              S3 NOT PD>20000331 



4/3,K/1 (Item 1 from file: 275) 

DIALOG(R) File 275:Gale Group Computer DB(TM) 
(c) 2004 The Gale Group. All rts. reserv. 

01790297 SUPPLIER NUMBER: 16293697 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Extending probability to fuzzy probability, (includes related articles on a 
mathematical definition of a fuzzy event, and on a mathematical 
definition of linguistic probability) (Technical) 

Hoffman, Mark E. 

AI Expert, v9, n12, p38(4) 

Dec, 1994 

DOCUMENT TYPE: Technical ISSN: 0888-3785 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 1893 LINE COUNT: 00156 

... E can be calculated as the sum of the probability of each omega 

that is a member of E, or as the sum of the probability of each omega 
times its membership in E. 

Mathematical Expression Omitted 
Let E be a fuzzy event. mu.sub.E(omega) is the membership of 
omega in E. The membership can be any value between 0 and 1, 
inclusive, mu.sub.E(omega) subset or equal to [0, 1]. The 
probability of E is calculated as the sum of the probability of each 
omega times its membership. 

Mathematical Expression Omitted 
Let omega = {omega.sub.1, ..., omega.sub.n} be a sample space. 
P(x) is the linguistic probability of element omega.sub.i. A linguistic 
probability is a fuzzy set of probabilities where each element p has a 
degree of membership. 

Mathematical Expression Omitted 
The sum of the probabilities of the omega.sub.i's. 
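
The fuzzy-event calculation described in this excerpt (the sum over outcomes of probability times membership) can be sketched in a few lines of Python. The function name, the dictionary representation, and the die example are assumptions for illustration; the article's own omitted expressions are not reproduced here.

```python
def fuzzy_event_probability(probabilities, membership):
    """Sum over outcomes of P(outcome) times its membership in the event.

    probabilities maps each outcome to its probability; membership maps
    outcomes to a degree of membership in [0, 1] (missing outcomes count
    as 0).
    """
    return sum(p * membership.get(outcome, 0.0)
               for outcome, p in probabilities.items())


# Hypothetical example: a fair die and the fuzzy event "a large number".
die = {face: 1.0 / 6.0 for face in range(1, 7)}
large = {4: 0.5, 5: 0.8, 6: 1.0}
probability_of_large = fuzzy_event_probability(die, large)
```

With memberships restricted to 0 and 1 this reduces to the ordinary probability of a crisp event.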



4/3, K/2 (Item 2 from file: 275) 

DIALOG(R) File 275:Gale Group Computer DB(TM) 
(c) 2004 The Gale Group. All rts. reserv. 

01502706 SUPPLIER NUMBER: 11961851 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

What's the code? The Classic Knight's tour problem, (includes The Knight's 
Tour— Intro to Heuristics) (Tutorial) 

Stafford, Dave 

Computer Shopper, v12, n3, p679(3) 
March, 1992 

DOCUMENT TYPE: Tutorial ISSN: 0886-0556 LANGUAGE: ENGLISH 

RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 1913 LINE COUNT: 00135 

. . . how this works . 

If Count is 1 and we find another move as good as Best, then Count is 
incremented. If random(Count) returns a zero (it will return a number 
between 0 and Count -- in this case a 0 or 1), then Best is set to 
NewMove. The probability is 0.5. If we subsequently find another good 
move, then the probability will be 0.33. The fourth move will have a 
probability of 0.25 (and so on). 

2. An even better method is to examine each candidate move in random 
order and simply keep track of... 
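
The counting scheme in item 1 can be sketched in Python as follows. The function and parameter names are hypothetical, and `randrange(count)` stands in for the article's `random(Count)` on the assumption that it returns an integer from 0 up to, but not including, Count, which is what the quoted probabilities of 0.5, 0.33 and 0.25 imply.

```python
import random


def pick_best(moves, score, rng=random):
    """Return a best-scoring move, breaking ties uniformly at random.

    Each new move that ties the current best replaces it with probability
    1/count, where count is the number of tied moves seen so far.
    """
    best = None
    best_score = None
    count = 0
    for move in moves:
        s = score(move)
        if best is None or s > best_score:
            best, best_score, count = move, s, 1
        elif s == best_score:
            count += 1
            if rng.randrange(count) == 0:
                best = move
    return best
```

The appeal of this approach is that the tied moves never need to be stored; a single pass and one counter are enough.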
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DIALOG (R) File 275:Gale Group Computer DB (TM) 
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01318760 SUPPLIER NUMBER: 07948020 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

A highly random random-number generator. 

Elkins, T.A. 

Computer Language, v6, n12, p59(5) 
Dec, 1989 

ISSN: 0749-2839 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT; ABSTRACT 



WORD COUNT: 2377 LINE COUNT: 00171 

... of my random numbers. Accordingly, whenever any of the 

random- number generator's three sections is outside the range 0-32,767, a 
flag is set . Any sum where the flag is set is then discarded for output 
purposes, and the system loops. 

With all this code, however, the end results are very interesting. 
Each section contributes equally probable integers between 0 and 
32,767. Three such integers are summed and the result is constrained into 
this same range , where every number has the same probability of 
occurrence. Figure 1 illustrates a perfect system mod 3. 

Notice that each of the possible numbers 0, 1, and 2 is used three 
times in each column in generating each final number 0, 1, and 2. It 
follows that the probability of each value 0, 1, and 2 is just 1/3. Few 
Run- random random-number generators can make such a claim. Unfortunately, 
this one . . . 
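
The claim that summing three equally probable values and reducing the result modulo the range leaves every value equally likely can be checked by brute-force enumeration. The Python sketch below does this for the mod 3 case mentioned in the excerpt; the function name is hypothetical, and treating "constrained into this same range" as a modulo reduction is an assumption, since the excerpt does not show the article's actual code.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sum_mod_distribution(modulus, terms=3):
    """Exact distribution of (sum of `terms` uniform values) mod `modulus`."""
    counts = Counter(sum(combo) % modulus
                     for combo in product(range(modulus), repeat=terms))
    total = modulus ** terms
    return {value: Fraction(counts[value], total) for value in range(modulus)}
```

For modulus 3 there are 27 equally likely triples, and each residue 0, 1 and 2 arises from 9 of them, giving probability 1/3 for each.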



4/3,K/4 (Item 4 from file: 275) 

DIALOG(R) File 275:Gale Group Computer DB(TM) 
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01284581 SUPPLIER NUMBER: 07240867 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Re-create reality in a spreadsheet, (using at-RAND) (includes related 
article on producing random numbers) 

Genis, Richard C. 
Lotus, v5, n1, p59(4) 
Jan, 1989 

ISSN: 8756-7334 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 2379 LINE COUNT: 00179 

. . . d replace the slip and pull again to determine the service time of 

the second visitor. She'd repeat the process until she developed a set of 
random arrival intervals and service times for 15 visitors. 

Computer software eliminates the need for such devices. In 
particular, 1-2-3 and Symphony provide the function @RAND, which returns 
a fractional number between 0 and 1, noninclusive. These numbers 
can be used to represent probabilities , but there will be many more than 
the 100 possibilities you'd get with the pieces of paper in the hat. The 
number will be. . . 
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01277003 SUPPLIER NUMBER: 07375372 

Selecting an uncertainty management system, (technical) 

Rothman, Peter 

AI Expert, v4, n7, p56(7) 

July, 1989 

DOCUMENT TYPE: technical ISSN: 0888-3785 LANGUAGE: ENGLISH 

RECORD TYPE: ABSTRACT 

...ABSTRACT: Bayes' rule. Certainty factors use heuristic measures of 
belief and disbelief in a given hypothesis. Dempster-Shafer evidential 
reasoning applies a mathematical theory which modifies probability theory 
by reflecting the unknown value of a variable. Fuzzy logic arises from 
fuzzy set theory, an extension of set theory in which the set 
membership function may assume values between zero and one. 
Commercially available expert system development tools are evaluated for 
their UMS features. 



4/3, K/6 (Item 6 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB (TM) 
(c) 2004 The Gale Group. All rts. reserv. 

01258464 SUPPLIER NUMBER: 07160557 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Is it a fluke? (probability, sample data, and management decisions) 
(includes related article on origins of probability theory) 

Gardner, Everette S., Jr. 
Lotus, v4, n12, p62(5) 
Dec, 1988 

ISSN: 8756-7334 LANGUAGE: ENGLISH RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 2155 LINE COUNT: 00157 

. . . manager problem, the number of successes is the number of sales 

made, 12. The number of trials is the number of calls made, 100. The 
probability of success in one trial is 25%. Enter 12 in cell D4 , enter 100 
in cell D5, and enter .25 (or 25%) in cell D6. Cell D17 displays 0.06%, 
the probability that 100 calls will result in exactly 12 sales. 
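
The worksheet formula behind cell D17 is not shown in this excerpt, but the quoted figure is consistent with the standard binomial probability of exactly k successes in n trials. The following Python sketch, with a hypothetical function name, computes that quantity so the example can be checked independently of the spreadsheet.

```python
import math


def binomial_probability(successes, trials, p):
    """Probability of exactly `successes` successes in `trials` independent
    trials, each succeeding with probability p."""
    return (math.comb(trials, successes)
            * p ** successes
            * (1.0 - p) ** (trials - successes))


# The sales example: 12 sales in 100 calls with a 25% chance per call.
exactly_twelve = binomial_probability(12, 100, 0.25)
```

For these inputs the result is roughly 0.0006, which agrees with the 0.06% quoted from the worksheet.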

Before going on, change the values in range D4..D6 back to 0, 
4, and 1/6. 

THE PROBABILITY TABLE 

The table gives you a more complete listing of the probability 
information pertaining to a sampling problem. To set up the table, first 
enter the labels shown in cell F5 and in range G2..15. Enter a backslash 
and a hyphen (\-) in cell G3... 



4/3, K/7 (Item 1 from file: 16) 

DIALOG(R) File 16:Gale Group PROMT(R) 
(c) 2004 The Gale Group. All rts. reserv. 

•\ '■'■■> \ ;r 3 Supplier Number: 46381772 (USE FORMAT 7 FOR FULLTEXT) 
Irregular tooth spacing reduces roller cone bit tracking problems 

The Oil and Gas Journal, p84 
May 13, 1996 

Language: English Record Type: Fulltext 
Document Type: Magazine/Journal; Trade 
Word Count: 2352 

. . . The algorithm requires the tooth count and row diameter as inputs, 

applies pertinent engineering constraints (for example, minimum section 
between adjacent inserts), and produces a set of ranked anti-tracking 
insert layouts. 

The tracking coefficient, called trackability, is also used to compare 
manually chosen pitching schemes and determine the best selection for a 
given design. Trackability numbers range between zero (least likely 
to track) and one (most likely to track) and are similar to probability 
coefficients . 

Laboratory trackability tests were performed on a drilling machine. 
This apparatus is a vertical lathe engineered to accept a 6-ft diameter 
rock cylinder . . . 

4/3, K/8 (Item 1 from file: 148) 

DIALOG(R) File 148:Gale Group Trade & Industry DB 
(c)2004 The Gale Group. All rts. reserv. 

: \ 6 SUPPLIER NUMBER: 61030009 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Designing cellular manufacturing systems with dynamic part populations. 

WICKS, ELIN M.; REASOR, RODERICK J. 
IIE Transactions, 31, 1, 11 
Jan, 1999 

ISSN: 0740-817X LANGUAGE: English RECORD TYPE: Fulltext 

WORD COUNT: 7082 LINE COUNT: 00625 

pool according to the integer portion of the expected number of 
copies. The fractional portions are normalized to sum to one and are 
treated as probabilities during the selection process to fill out the 
rest of the mating pool. A uniform number between zero and one is 
generated and the corresponding solution receives an additional copy in the 
mating pool. The probability of the solution receiving another copy is 
set equal to zero. This process continues, with each population member 
receiving at most one additional copy, until the mating pool is full. In 
this research . . . 
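
The mating-pool procedure quoted above can be sketched in Python. This is an illustrative reading, not the authors' code: the function name is hypothetical, and renormalizing the remaining fractional portions after each draw is an assumption, since the excerpt says only that the chosen solution's probability is set to zero before the process continues.

```python
import random


def fill_mating_pool(expected_copies, pool_size, rng=random):
    """Build a mating pool of solution indices from expected copy counts.

    Each solution first receives the integer part of its expected copies.
    The fractional parts are then used as selection probabilities, and
    each solution can win at most one additional copy.
    """
    pool = []
    fractions = []
    for index, expected in enumerate(expected_copies):
        whole = int(expected)
        pool.extend([index] * whole)
        fractions.append(expected - whole)
    while len(pool) < pool_size and any(f > 0.0 for f in fractions):
        # Scaling the uniform draw by the remaining total is equivalent to
        # normalizing the remaining fractions so they sum to one.
        threshold = rng.random() * sum(fractions)
        running = 0.0
        chosen = None
        for index, fraction in enumerate(fractions):
            if fraction <= 0.0:
                continue
            chosen = index
            running += fraction
            if threshold < running:
                break
        pool.append(chosen)
        fractions[chosen] = 0.0  # no second extra copy for this solution
    return pool
```

With expected copies of 1.5, 0.5 and 2.0 and a pool of four, the third solution always gets two copies, the first gets at least one, and the last slot goes to either the first or the second solution.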

4/3, K/9 (Item 2 from file: 148) 

DIALOG ( R) File 148:Gale Group Trade & Industry DB 
(c)2004 The Gale Group. All rts. reserv. 
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... as : 

{(f.sub.1)(p/i = acquired)/(f.sub.2)(p/i = unacquired)} (greater 
than or equal to) ( A ) 

Following Palepu's (1986) procedure, the probability of acquisition 
is computed for each firm in the estimation sample, based on the generated 
. . . model. The observations are grouped into ten equal intervals 
according to the probability of acquisition (refer to Table 3). For Model 
1, which is based on Palepu's original variable set, the probabilities 
range from zero to 0.690. The number of acquired and unacquired 
firms falling within each interval is expressed as a percentage of the 
total of acquired and unacquired firms within the sample, respectively. In 
Figure 1, the midpoint of each probability interval for the acquired 
firms is plotted against the percentage of acquired firms in the interval. 
A similar plot is produced for the unacquired firms... 
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previous step, and they expire at the next time step. Thus, options 
with different maturities can be incorporated; but in order to obtain the 
required set of option prices, extensive interpolation and extrapolation 
using the observed option prices is needed. 

Also, the tree is not necessarily arbitrage-free, since negative 
probabilities can occur. These negative probabilities have to be reset 
to values between zero and one in an ad hoc fashion. As a result, 
Derman and Kani (1994) trees become numerically unstable, especially when 
the number of steps is large. 

For. . . 
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9 1.4 

{ • j Quartiles . ((dagger)) 95th centile at baseline. (9) 

We hypothesised that a higher excretion of the indicator at the baseline 
examination increased the probability of an unfavourable outcome, and 
conversely. Thus, an ideal likelihood ratio should be 0 in the lowest 
probability group (lowest quartile) and infinite odds ratio in the highest 
probability group (highest quartile). In practice, a test is of no value when 
it has a likelihood ratio of 1.0, likelihood ratios between 0.5 and 2.0 are 
of doubtful significance, and values greater than 10 or less than 0.1 for the 
high and low probability category, respectively, indicate that the test has 
a good discriminatory power. (16) 

Table 4: Likelihood ratios 

Men 

Participants (n=208) 

Age (years) 

Overall 45.2... 
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NPV) is calculated. This NPV is extracted after each run. (Table 1 
shows the calculation of the NPV on one run.) All NPVs are then set in 
decreasing order and matched with their rank (one to 1,000). Each rank 
is then divided by 1,000, the total number of runs, giving 1,000 
percentage numbers varying between 0% and 100%. 

The final step constructs a curve using the range of possible 
appraised values (the NPV results of the runs) as the X-axis and the 
corresponding 0%-100% values as the Y-axis (ILLUSTRATION FOR FIGURE 1 
OMITTED) . This curve will be unique to the analysis conducted because the 
probability distributions chosen for each factor are unique to that 
analysis. In mathematical terms, the curve represents a cumulative 
distribution function. For each NPV chosen, the... 
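
The ranking step in this excerpt can be sketched as below. The function name `rank_percentages` is hypothetical, and the sketch covers only the sort-and-rank step, not the NPV runs themselves. Because the values are sorted in decreasing order, each fraction is the share of runs whose NPV is at least as large as that value.

```python
def rank_percentages(npvs):
    """Pair each NPV with rank / number_of_runs, ranking in decreasing order."""
    ordered = sorted(npvs, reverse=True)
    n = len(ordered)
    # Rank 1 goes to the largest NPV; the last rank maps to 1.0 (100%).
    return [(value, rank / n) for rank, value in enumerate(ordered, start=1)]
```

With four runs, `rank_percentages([10, 30, 20, 40])` pairs 40 with 0.25 and 10 with 1.0; plotting the values on the X-axis against the fractions on the Y-axis gives the curve described above.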
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... vectors of dimension 16. Furthermore, because in our analysis only 

monotone patterns of response were possible, the summation defining 
((A.sub.i).sup.(2))((Alpha)) ranges only over the values of r in the 
set H={(0,0,0,0),(1,0,0,0),(1,1,0,0),(1,1,1,0)}. Notice that setting 
both ((A.sub.i).sup.(1))((Alpha)) and 
(d.sup.(2))((X.sub.i);(Beta)) identically equal to 0 amounts to separately 
estimating the nonresponse probabilities from the solution of 
((Sigma).sub.i)((A.sub.i).sup.(2))((Alpha)) = 0 and then estimating (Beta) 
from the solution to (9) with ((Pi... 



4/3,K/14 (Item 7 from file: 148) 

DIALOG (R) File 148: Gale Group Trade & Industry DB 
(c)2004 The Gale Group. All rts. reserv. 

10470773 SUPPLIER NUMBER: 21146562 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Competing Risk Analysis of Men Aged 55 to 74 Years at Diagnosis Managed 
Conservatively for Clinically Localized Prostate Cancer. 

Albertsen, Peter C; Hanley, James A.; Gleason, Donald F. ; Barry, Michael 
J . 

JAMA, The Journal of the American Medical Association, v280, n11, p975(l) 
Sept 16, 1998 

ISSN: 0098-7484 LANGUAGE : English RECORD TYPE: Fulltext; Abstract 

WORD COUNT: 5826 LINE COUNT: 00588 

... ratio, 1.9; 95% confidence interval (CI), 1.6-2.2 after adjustment 

for age compared with patients who had few or no comorbidities). The 
probability of dying from prostate cancer, however, was comparable 
between these 2 groups of patients (mortality rate ratio, 1.26; 95% 
CI, 0.95-1.69). The number of patients with Charlson scores of 0 to 
1 and 2 or more is listed by patient age and Gleason score in Table 2. 

Preliminary analysis of the data also revealed a significant impact... 



4/3,K/15 (Item 8 from file: 148) 

DIALOG (R) File 148:Gale Group Trade & Industry DB 
(c)2004 The Gale Group. All rts. reserv. 

09025069 SUPPLIER NUMBER: 18765954 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Teenage employment and the spatial isolation of minority and poverty 
households. (Comment) 

O'Regan, Katherine M.; Quigley, John M. 
Journal of Human Resources, v31, n3, p692(11) 
Summer, 1996 

ISSN: 0022-166X LANGUAGE: English RECORD TYPE: Fulltext; Abstract 

WORD COUNT: 4166 LINE COUNT: 00383 

. . . sub.ij) is the exposure of the ith group to members of group j. 

(n.sub.it) and (n.sub.jt) are the number of group i and group j people 
in tract t, (N.sub.i) is the total number of group i people in the MSA, 
and (N.sub.t) is the total number of people in tract t. The index number, 
which ranges from 0 to 1, measures the probability, for the 
average member of group i, that a randomly picked resident of his or her 
census tract is a member of group j. 
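
The formula itself is cut off in this excerpt. A minimal sketch, assuming the standard exposure index, the sum over tracts of (n.sub.it/N.sub.i)(n.sub.jt/N.sub.t), which matches the definitions and the probability interpretation given above; the function and argument names are hypothetical:

```python
def exposure_index(group_i_counts, group_j_counts, tract_totals):
    """Exposure of group i to group j across tracts.

    Each argument is a list with one entry per census tract.
    Assumes the standard form: sum_t (n_it / N_i) * (n_jt / N_t).
    """
    total_i = sum(group_i_counts)  # N_i: all group i people in the MSA
    index = 0.0
    for n_it, n_jt, n_t in zip(group_i_counts, group_j_counts, tract_totals):
        if n_t > 0:
            index += (n_it / total_i) * (n_jt / n_t)
    return index
```

If all ten members of group i live in a tract that is half group j, `exposure_index([10, 0], [10, 20], [20, 20])` gives 0.5.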

Social isolation of minority households decreases their contact with 
both non-minority (white) and nonpoor households. We presume that exposure 
to whites, who have... 
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. . . levels are 500 budget units, 50 researchers, 150 units of X and 200 

units of Y . 

The constraint levels are increased for each successive problem-set 
size to reflect a more realistic selection process. As the problem size 
increases (i.e., number of projects under consideration) it is likely that 
the number of projects selected will also increase. 

For each project, values for budget, profit (return), probability of 
success, number of researchers, facilities required and duration of each 
facility, and the sequence of facilities are generated randomly by the 
Monte Carlo method. The budget requirement for individual projects is 
defined by a uniform distribution between 1 and 50 units. Profit 
(return) is computed by generating a random value between 1 and 5 and 
multiplying this number by the budget value. The probability of 
success is described by a uniform distribution from 0.5 to 1.0. The 
number of researchers is defined by a uniform (integer) distribution from 1 
to 5. The number and sequence of facilities required and the duration... 
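
A minimal sketch of the random project generation described above, limited to the four attributes whose distributions are stated in the excerpt. The function name and dictionary keys are hypothetical, and the facility attributes are left out because their distributions are cut off.

```python
import random


def generate_project(rng):
    """Draw one project's attributes from the distributions in the excerpt."""
    budget = rng.uniform(1, 50)           # uniform between 1 and 50 units
    profit = rng.uniform(1, 5) * budget   # random multiplier times budget
    p_success = rng.uniform(0.5, 1.0)     # uniform from 0.5 to 1.0
    researchers = rng.randint(1, 5)       # uniform integer from 1 to 5
    return {
        "budget": budget,
        "profit": profit,
        "p_success": p_success,
        "researchers": researchers,
    }
```

Passing a seeded generator such as `random.Random(1)` makes a generated problem set reproducible.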
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. . . and lower control limits are 3 and -3, respectively. The zones and 

subgroup averages for this chart are shown in Fig. 2. With a threshold 
value of 1, the features extracted from the observed line segments or 
time series on the control chart are abcbab. If the threshold value is 
set at 0.5, the primitives extracted from this set are accbac. 
Assuming all subgroups are independent, the transition probability of the 
subgroup 3 observation being in zone 4 and subgroup 4 observation being in 
zone 5 is (Mathematical Expression Omitted) (Mathematical Expression 
Omitted). The . . . 
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. . . The algorithm requires the tooth count and row diameter as inputs, 

applies pertinent engineering constraints (for example, minimum section 
between adjacent inserts), and produces a set of ranked antitracking 
insert layouts. 

The tracking coefficient, called trackability, is also used to compare 
manually chosen pitching schemes and determine the best selection for a 
given design. Trackability numbers range between zero (least likely 
to track) and one (most likely to track) and are similar to probability 
coefficients. 



Experimental results 

Laboratory trackability tests were performed on a drilling machine. 
This apparatus is a vertical lathe engineered to accept a 6-ft diameter... 
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. . . C and D is shown in Figure 2. 

When using larger event trees that describe more complex processes, we 
can perform simulations using the calculated probabilities as inputs. 
Simulations use random numbers based on these inputs to emulate real-world 
probabilistic events. Each simulation, usually generated by computer, 
represents a possible loss scenario. Loss data resulting from a large 
number of these simulations could be grouped into ranges such as 0 
to $1 million, $1 million to $10 million and so on. In this way, we can 
. . . discrete loss probability distributions for our exposure. 
. . . on probability-loss graphs, these distributions give us a more 
. . . picture of the risk associated with a given exposure than merely 
. . . the outcomes of individual... 
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are then instantly classified in decreasing order using one command 
(in Microsoft Excel, "Sort"). Each value is given its rank in the 
decreasing order from 1 to 1,000; this rank is divided by the total 
number of values (or 1,000), producing a series of increasing 
percentages from 0 to 100%. The percentages, in effect the probabilities 
that a target will end up being worth more than a given price, are charted 
on a horizontal axis and the values related to these odds are charted on a 
vertical axis. Each set of percentages/values is represented as a dot, 
generating the curve shown in Exhibit 2. 

We think that this curve is a significant pricing tool... 
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standard error of the difference, 4.6 percent, is therefore 
approximately equal to 0.62 percent. Since 4.6 > 2 x 0.62 or 4.6 > 1.24, the 
difference is significantly different from zero at the 0.05 level. 

If instead of testing precisely one difference between two 
percentages or numbers at the 0.05 level of significance, multiple 
sets of differences were tested, each at that level, then the probability 
of finding a significant difference when, in fact, there is no difference, 
will be larger than the 0.05 level. The test described here is at the 
0.05 level for a difference between two percentages or numbers. 
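
The two points above can be sketched as follows. The function names are hypothetical. The first function applies the excerpt's rule of comparing a difference against twice its standard error. The second assumes the multiple tests are independent, which the excerpt does not state, to show how the chance of at least one false positive grows beyond 0.05.

```python
def is_significant(difference, standard_error, multiplier=2.0):
    """Two-standard-error rule: significant at roughly the 0.05 level."""
    return abs(difference) > multiplier * standard_error


def familywise_error(alpha, n_tests):
    """Probability of at least one false positive in n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests
```

With the excerpt's figures, `is_significant(4.6, 0.62)` is true because 4.6 exceeds 1.24. Under the independence assumption, ten tests at the 0.05 level give a familywise error of about 0.40.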

Approximations of the standard errors of the estimated percentage of 
persons and number of persons from the 1-percent file are shown in 
tables I and II, respectively. These estimates were used to fit regression 
curves to provide estimates of approximate standard errors... 
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... Salmonella risk treatments. Note the relatively wide range in 

option price bids before trial 11. All six treatments were identical before 
trial 11. No objective probabilities were known at this point. 
Differences prior to trial 11 can therefore be attributed to differences 
among the six groups in terms of market prices and group dynamics. The 
mean values of trials 7 through 10 ranged from $0.44 to $1.32. This 
range exceeds that obtained when alternative pathogens were used. 
Before trial 11, the subjects were given the information on the 
probability of illness. As expected, figure 3 reveals that average option 
price bids increased when participants discovered that there was a 1 in 
13.7 chance . . . 
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0.21 (0.20) 
                                                              -1.04 
living alone                              0.40 (0.23)          1.71 
living with parents                       0.73 (0.28)          2.62 
other                                     reference group 

EDUCATION 
higher vocational/university              0.36 (0.21)          1.74 
other                                     reference group 

LABOUR MARKET POSITION 
pursuing an education                     1.16 (0.34)          3.37 
other (with job, unemployed)              reference group 
housewife (husband)                      -0.37 (0.22)         -1.65 
disabled                                 -0.57 (0.35)         -1.64 

HOUSE OWNERSHIP 
owner-occupier                           -0.21 (0.15)         -1.35 
rented accommodation                      reference group 

MARGINAL RATE 
rate (between 0 and 1) (a)                0.75 (0.42)          1 

Number of observations: 966. 
Standard errors in brackets. 



(a) Combined marginal rate of taxation, income-dependent rent subsidy and 
income-dependent social security benefits. 

Table 2. Probability of demand participation (logit), households 

                                                               t-value 
Constant                                 -1.80 (0.27) 

HOUSEHOLD LIFE CYCLE 
couple with young children/ 
one adult with children                   0.38... 
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... 5. Again, standard statistical packages usually have a component 

available to calculate tail probabilities based on the Poisson 
distribution . 

A fourth approach to calculating the probability of observing a 
given number of deaths is simulation. This allows one to estimate the 
actual distribution of outcomes likely in a set of patients, each of whom 
has a specific probability of dying. The approach is straightforward. A 
number is drawn from a uniform random distribution with a range from 0 
to 1. If the number drawn is less than the patient's probability of 
dying, the patient is counted as dead, otherwise, alive. A new number is 
drawn for each patient in the hospital, thus obtaining an "observed" number 
of deaths given a single set of n random draws. If one then repeats this 
process K times, one can then count the number of times (i.e., simulations 
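
A minimal sketch of the simulation approach described in this excerpt. The function names are hypothetical. The excerpt is cut off before it says what is counted across the K runs, so counting runs with at least the observed number of deaths is an assumption.

```python
import random


def simulate_deaths(death_probs, rng):
    """One run: count patients whose uniform draw falls below their risk."""
    return sum(1 for p in death_probs if rng.random() < p)


def tail_probability(death_probs, observed_deaths, k_runs, seed=0):
    """Share of K runs with at least observed_deaths simulated deaths."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(k_runs):
        if simulate_deaths(death_probs, rng) >= observed_deaths:
            hits += 1
    return hits / k_runs
```

Here `death_probs` holds one probability of dying per patient, so a single call to `simulate_deaths` corresponds to one set of n random draws.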
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... the null hypothesis. A second set of tests is performed to try to 

overcome these problems . 
TABLE 5 

Classification Table of the Analysis Sample 

PREDICTED GROUP 
MEMBERSHIP 



FIRMS 
WITHOUT 
CONVERTIBLE 
29 
2 

0 .5 
the IZ.sub. 



FIRMS 
WITH 
CONVERTIBLE 
13 



ACTUAL GROUP MEMBERSHIP 
FIRMS WITHOUT CONVERT. 
FIRMS WITH CONVERTIBLE 

PRIOR PROBABILITY 0.5 0.5 

+ Values that fall inside the IZ.sub. 0 .025 

critical range . 
TABLE 6 

Classification Table (Using SALES, INTCOV, CAPEXP, SHARED) 

                              PREDICTED GROUP MEMBERSHIP 
                              FIRMS         FIRMS 
ACTUAL GROUP                  WITHOUT       WITH                    Z- 
MEMBERSHIP                    CONVERTIBLE   CONVERTIBLE   TOTAL     STATISTIC 

FIRMS WITHOUT CONVERT.        11            7             18        2.47 (*) 
FIRMS WITH CONVERT.           7             11            18        1.134 
PRIOR PROBABILITY             0.5           0.5 

For this set of tests, the same criterion described in the sample 
section is used except for three changes. The first is that an equal sample 
size is required from both groups --firms with convertible debt and those 
without. The second is that only the variables SALES, INTCOV, CAPEXP and 
SHARED are used. These are the variables... 
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. . . or ordinary least squares regression. The logit model is a specific 

form of a logistic function in which the dependent variable is interpreted 
as the probability of making a certain choice or falling into one 
group or another. In a logit model, the cumulative probability function 
has a maximum value of one since all probabilities lie between 0 
and 1. Because in most cases it is inappropriate to assume that the 
independent variables bear a linear relationship to the probability of a 
given choice or group membership, the logit model assumes a nonlinear 
form. (For more extensive discussions of logit modeling, see [1, 2, 13].) 
The general logit model is represented. . . 
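
The excerpt is cut off before the model is written out. As a minimal sketch, the standard logistic function below maps any real-valued index to a probability between 0 and 1. Combining the independent variables as a linear index is an assumption here, and the names `logistic` and `choice_probability` are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def choice_probability(intercept, coefficients, values):
    """Probability of a choice given a linear index of the variables."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return logistic(z)
```

An index of zero gives a probability of 0.5, and the output approaches but never leaves the 0-to-1 range as the index grows in either direction.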
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...TEXT: on Q. Thus if the probability of a sell-through price is high, 
that implies an overall, highly elastic demand for the video. If the 
probability is low, that implies a less elastic overall demand for the 
video. As shown in note [1], profit is maximized when P = MC/(1 - 
1/Omega). Given a marginal cost of a video of $4, price can be set based on 
values of Omega. If, for instance, the calculated probability for a video is 
between 0 and 0.33, Omega must be low. If the calculated probability 
is between 0.33 and 0.67, Omega must be higher. If the probability 
is between 0.67 and 1.0, Omega must be even higher. 

To operationalize this pricing rule we need a range for Omega. Given that 
P = MC/(1 - 1/Omega), a sell-through price of $20.00 is consistent with an 
elasticity (Omega) of 1.25. A... 
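
The pricing rule can be checked numerically. This sketch reads the rule as P = MC/(1 - 1/Omega), which reproduces the excerpt's own figures of a $4 marginal cost, an elasticity of 1.25, and a $20.00 price; the function name is hypothetical.

```python
def profit_maximizing_price(marginal_cost, omega):
    """Price from P = MC / (1 - 1/Omega); requires Omega > 1."""
    if omega <= 1:
        raise ValueError("the rule needs an elasticity greater than 1")
    return marginal_cost / (1.0 - 1.0 / omega)
```

Higher elasticities give lower prices under this rule: an Omega of 2 with the same marginal cost gives a price of $8.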
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...TEXT: multi-vehicle outbreak. If four of the omitted persons were moved 
from the "not ill/not consumed" band to the "not ill/consumed" band, the 
p-value increases to 0.1, which is not significant. Thus, manipulation 
of the figures within the range of error afforded by omitted returns has 
varied the probability value from 0.002 to 0.1. 

In any event, the use of Fisher's Exact Test and the calculation of values 
for only one set of foods presented problems, in that no comparative 
probability values for other foods could be estimated. It may have been 
possible to have manipulated sequences of figures to show one or other 
foods as . . . 
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...TEXT: and one chance in a hundred (1/100 or .01). Probability values 
within this range are assigned a degree of fit, or membership, in this set 
based on our level of confidence in that assessment. Due to knowledge 
imperfections, the absolute highest level of confidence we can assign must 
be shared by the interval between one in five hundred (1/500 or .002) 
and one in two hundred (1/200 or .005). Values on the fringes of this 
confidence interval are assigned membership degrees between 0 and 1. 

Note that this fuzzy probability gives us a far better feel for the 
uncertainties involved than a simple linguistic assessment such as 
"unlikely" or "improbable." We can, for example, compute... 
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Time limits on welfare receipt 
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...TEXT: individual, ignoring demographic characteristics for the time 
being. An individual is followed after her first quarter on welfare. The 
model provides an estimate of the probability an individual will stay on 
welfare after one quarter of, say, 25%. After the first quarter, a random 
number between 0 and 1 is generated. If the number falls below 
25%, the individual continues on welfare. If it does not, the individual 
exits from welfare. This procedure is repeated quarter by quarter, using 
estimates of the probability that the spell will last until quarter q + 
1, given that it has already lasted q quarters. When the individual 
finally leaves welfare, a different set of parameter estimates is used to 
predict the likelihood of an individual returning to welfare. The process 
continues, alternating spells on welfare with spells off... 
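
A minimal sketch of the quarter-by-quarter procedure for a single welfare spell. The function name and the list of continuation probabilities are hypothetical; in the excerpt these probabilities come from the estimated model, and the alternation with off-welfare spells is not modeled here.

```python
import random


def simulate_spell_length(continuation_probs, rng):
    """Return the number of quarters a simulated welfare spell lasts.

    continuation_probs[q - 1] is the probability that the spell lasts
    until quarter q + 1, given that it has already lasted q quarters.
    The spell is cut off once the supplied probabilities run out.
    """
    quarters = 1  # the individual is followed after her first quarter
    for p in continuation_probs:
        if rng.random() < p:  # draw falls below p: continue on welfare
            quarters += 1
        else:                 # otherwise the individual exits
            break
    return quarters
```

With a continuation probability of 25% after the first quarter, about a quarter of simulated individuals remain on welfare for a second quarter.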
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Modeling earnings expectations based on clusters of analyst forecasts 

Mozes, Haim A; Williams, Patricia A 
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WORD COUNT: 5200 

...TEXT: mean and the consensus. It may be possible, however, to construct 
a superior forecast that is a weighted average of the current and the 
previous cluster means. We attempt to construct such an expectation 
measure as follows. First, we estimate Equation (2), where BETTERP equals 
one if the current cluster mean is more accurate than the previous 
cluster mean, and BETTERP equals zero otherwise: (Formula Omitted) 



The logistic estimate for BETTERP is a probability between zero and 
one, denoted by p. The estimated value for p represents the 
probability that the current cluster mean is more accurate than the 
previous cluster mean, and the estimated value for (1- p) represents the 
probability that the previous cluster mean is more accurate than the 
current cluster mean. 

(Graph Omitted) 
Captioned as: EXHIBIT 1 

(Graph Omitted) 
Captioned as: EXHIBIT 2 

(Graph Omitted) 
Captioned as: EXHIBIT 3 

Then, we compute an adjusted cluster mean, by weighting the current 
cluster mean by p and the previous cluster mean by (1- p) . If the current 
cluster mean is more accurate than the adjusted mean, this would. . . 
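
The weighting described above amounts to a one-line calculation, sketched here with a hypothetical function name:

```python
def adjusted_cluster_mean(p, current_mean, previous_mean):
    """Weight the current cluster mean by p and the previous one by (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between zero and one")
    return p * current_mean + (1.0 - p) * previous_mean
```

For example, with p = 0.75, a current cluster mean of 2.0 and a previous cluster mean of 1.0, the adjusted mean is 1.75.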
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...TEXT: female sex workers. They were dichotomised where necessary and, 
following a series of chi-squared tests, those significantly associated 
with psychological disturbance were identified and grouped into two broad 
categories: background and work-related variables (Table 1). Given the 
small numbers in the study, the 0.15 probability level was used. 
Associations between psychological status and the sex worker's current 
age, marital status, employment status prior to entering the sex industry 
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...TEXT: of years to another, the change in probabilities creates 
troublesome discontinuities in values and graphs. As a result, the author 
modified the values with a set of equations that smoothed the transitions 
from one age group to another. Other analysts may wish to do the same. 
For the example in this paper, equations for probabilities by age range 
are 45-49, Gamboa's values unchanged; 50-59, Prob. = 1.4891 - 
0.01056Age; 60-67, Prob. = 4.0195 - 0.05309Age; 69-72, Prob. = 2.4256 - 
0.0293Age. 
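
The smoothing equations can be written as a piecewise function. This is a sketch under stated assumptions: the operator in the 50-59 equation is not legible in the source and is read here as a minus sign, the function name is hypothetical, and ages without an equation in the excerpt (including 45-49, where table values are used unchanged, and 68) are rejected rather than guessed.

```python
def smoothed_probability(age):
    """Evaluate the excerpt's linear equations for ages 50-67 and 69-72."""
    if 50 <= age <= 59:
        return 1.4891 - 0.01056 * age   # minus sign is an assumption
    if 60 <= age <= 67:
        return 4.0195 - 0.05309 * age
    if 69 <= age <= 72:
        return 2.4256 - 0.0293 * age
    raise ValueError("no equation is given in the excerpt for this age")
```

Under this reading the values decline with age, from about 0.91 at age 55 to about 0.37 at age 70.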

References 

Akerlof, George A. 1986... 
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...TEXT: corresponding decrease (or increase) in prior probabilities of the 
competing hypotheses. 8 

The notion that auditors represent hypotheses as independent entities, 
rather than a related set, suggests a positive association between the 
number of initial hypotheses and the sum of the initial probability 
ratings (P3B). That is, the more hypotheses under consideration, the higher 
the sum of the initial probabilities. On the other hand, a complementary 
evaluation implies a fixed pool of probability and, therefore, no 
association between the number of hypotheses being considered and the 
summed probability ratings (P3A). The Pearson (Spearman) correlation 
between the number of hypotheses under consideration at iteration 1 
and the sum of the initial probabilities was 0.627 (0.571), which is 
significant (p <= 0.0001 (0.0001)). This supports P3B and provides 
additional evidence that auditors represent hypotheses as independent 
entities . . . 
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How to overcome inertia and get moving on bar codes 
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...ABSTRACT: to use bar codes, has the know-how, and has the necessary 
resources will soon be using them. By assigning each of these factors a 
value between zero and one and multiplying them together, the 
likelihood, in percent, that a particular industry or company will adopt 
the technology can be expressed. All 3 components are necessary to assure a 
high probability of adoption of automatic data collection. 
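
The multiplication described in this abstract can be sketched as follows. The function and parameter names are hypothetical labels for the three factors; the abstract names know-how and resources, and the first factor is cut off.

```python
def adoption_likelihood_percent(first_factor, know_how, resources):
    """Multiply three factors, each between zero and one, into a percentage."""
    for factor in (first_factor, know_how, resources):
        if not 0.0 <= factor <= 1.0:
            raise ValueError("each factor must lie between zero and one")
    return first_factor * know_how * resources * 100.0
```

Because the factors are multiplied, a low value on any one of them pulls the result down: factors of 0.5, 0.5 and 1.0 give 25 percent.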
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Effect of a chipper-canter knife clamp on the quality of chips produced 
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...TEXT: the ovendry weight- to-green-volume ratio. The data were evaluated 
using analyses of variance and significance was reported for the 5 and 1 
percent probability levels. 

RESULTS AND DISCUSSION 

The black spruce logs yielded relatively homogeneous groups. The mean 
specific gravity for sapwood was 0.431 and 0.436 for heartwood; the 
difference between both values was insignificant. However, the MC of 
sapwood and heartwood was considerably different: 126 and 39 percent, 
respectively. 

The main results of this work are summarized... 
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...TEXT: Intuitively, a good fit is desirable since the recursive Kalman 
relations make decisions regarding the relative importance of the 
alternative models and determine the posterior probabilities as a 
function of the predictive density function. The prior settings are fairly 
diffuse, yet still proper, and the amount of sample data is large, making 
the impact of the prior settings within reasonable ranges fairly 
unimportant. 

The prior values for the parameters (m sub 0) in both models 1 and 2 
were set at 0.0 and the prior variances (C sub 0) for these parameters 
were set at 0.025(I sub k) phi, where phi is... 



4/3,K/38 (Item 1 from file: 647) 

DIALOG (R) File 647: CMP Computer Fulltext 
(c) 2004 CMP Media, LLC. All rts. reserv. 

00631551 CMP ACCESSION NUMBER: EET19890109S4 351 

TRADITIONAL LOGIC CALLS THE SITUATION BLACK AND WHITE, BUT. . . Fuzzy 
logic says it's a matter of degree 

R. COLIN JOHNSON 

ELECTRONIC ENGINEERING TIMES, 1989, n 520, 68 
PUBLICATION DATE: 890109 

JOURNAL CODE: EET LANGUAGE: English 

RECORD TYPE: Fulltext 
SECTION HEADING: 520PG68 
WORD COUNT: 1151 

. . . impose its crisp categories on the real world, because the real 

world is not crisp. 

Kosko visualizes fuzzy logic as "a natural filling-in of set 
theory." He graphically depicts his geometrical filling-in of the unit 
hypercube by drawing a 2-dimensional cube (square), in which traditional 
logic only allows values at the vertices; for instance, (0,0), (0,1), 
(1,0) and (1,1). But fuzzy logic allows any analog value 
within the range of 0 to 1 to represent a situation; for example, (.1, 
.8), (.5, .5), (.2, .7) or any other combination. "Any point inside the 
unit hypercube is a fuzzy set," Kosko said. Probability theory also 
fills in the cube, but only on a plane intersecting (0,0,1), (0,1,0), 
(1,0,0) for a 3-D. . . 
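
The geometric picture in this excerpt can be sketched with three small checks. The function names are hypothetical, and reading the plane through (0,0,1), (0,1,0), (1,0,0) as the set of points whose coordinates sum to one is an interpretation of the excerpt.

```python
def is_fuzzy_point(point):
    """Any point inside the unit hypercube: every coordinate in [0, 1]."""
    return all(0.0 <= x <= 1.0 for x in point)


def is_crisp_vertex(point):
    """Traditional logic: only the vertices, every coordinate 0 or 1."""
    return all(x in (0, 1) for x in point)


def is_probability_vector(point, tol=1e-9):
    """Inside the cube and on the plane where the coordinates sum to one."""
    return is_fuzzy_point(point) and abs(sum(point) - 1.0) <= tol
```

The excerpt's example (.5, .5) is a fuzzy point but not a vertex, while (1, 0) is both.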
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... and the pursuit of "fuzzy" laws in the social sciences. 

The falling-shadow method shows how to collect raw data and score 
it, resulting in set-valued statistics, that is, a range of 
frequencies over the fuzzy interval between 0 and 1, rather than 
just a single-point frequency (as with conventional probability theory). 

For instance, probability theory might determine that 96 percent of 
the people think 21-year-olds are young, where set-valued statistics 
would determine that 96 percent of the people think that 17-year-olds to 
27-year-olds are young. Determining set-valued statistics... 
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